THE CHOICE OF QUESTIONS 
ON ESSAY EXAMINATIONS 


GEORGE MEYER 
Psychological Laboratory, University of Michigan 


Quite a common practice when the essay type of examination 
is given at the college level is to allow the students a choice of ques- 
tions. Two factors seem for the most part to account for this prac- 
tice. The first of these has to do with the emotional attitude of the 
examinee during the test. Since the usual essay test contains rela- 
tively few questions, it is quite probable that the student will find 
at least one question about which he believes he knows little or 
nothing. In such a case an emotional attitude may be set up which 
will interfere with the recall of the materials which answer the other 
questions. It is quite probable that the choice of questions serves in 
some part to eliminate this form of inhibition. 

In testing achievement an examination attempts to discover 
either how much or how little any given individual knows about the 
subject being tested. Here, too, the small number of questions in 
the usual essay test enters the picture. Since the sampling of the 
material being tested is very limited, such a test, theoretically, may 
fail its purpose and as a result the grades given on the test and subse- 
quently grades given in a course dependent on such a test or tests may 
be in error. The choice of questions supposedly remedies this
difficulty in part. Students believe it to be a procedure which enables
them to show more adequately what they really do know. No doubt 
teachers who give such tests believe that this is the case. 

It is the purpose of the present paper to study the matter of 
choice of questions. Is this belief which is held by examiner and 
examinee that the choice of questions gives a more adequate picture 
of what the examinee knows about the subject-matter which is being 
tested a valid one? Such a belief is based on the assumption that 
the student’s judgment concerning his knowledge is a reliable one. 
This assumption seems unwarranted, for it entirely neglects the fact 
that when the student makes a choice he is unknowingly gambling on 
factors over which he has little or no control. 

The matter of choice of questions really implies that the student 
can in some way decide, first, what the average score on the whole 
test is going to be and, secondly, what the average score on each 
individual question will be. This would involve, among other things, 
knowing the real difficulty of the question; knowing the scoring 
standards; and knowing who is to do the grading. 

The student, first of all, cannot know the real difficulty of each 
question. Even though the student believes he knows more about a 
question which he chooses than one he omits, the question which he 
chooses may be inherently a very difficult one and, although he does 
well on it with reference to other individuals who answer the same 
question, his absolute score on the test may be lower than if he had 
answered the omitted question. Secondly, the student cannot know 
the standards on which a question is to be scored. Hence, the same 
possibility of a lower absolute score is present as in the case of the 
preceding factor. In the third place, if the scoring is done by more 
than one individual, as it sometimes is at the college level (where one 
or more questions are read by different individuals), the student may 
choose a question which is graded by an individual with exceptionally 
high personal scoring standards. This would again tend to lower the 
individual’s absolute score. 

The following experiment provides some objective data on these 
matters. 


EXPERIMENT 


The Subjects.—Two experimental groups were used. One group
consisted of one hundred ninety-eight students in a course in abnormal
psychology and the other consisted of one hundred three students in 
the second semester of a year course in general psychology at the 
University of Michigan. 

The Tests.—The students in abnormal psychology were given a
choice of four out of five questions on their hour mid-semester examina- 
tion. The test blanks read as follows: 


Directions.—Answer first the questions about which you think you know
most. Indicate by labelling first, second, third, and fourth what you believe
are your best, your second best, your third best, and your fourth best answers,
respectively. Then answer the question which you have not as yet answered.
If you do better on this question than you did on some one or more of the other
four, it will be counted in place of the question on which you did most poorly.

1. Outline the differences and similarities between hypnosis and sleep. 


2. Outline the main principles of Freud’s psychology and its historical 
sources. 


3. Discuss Jung’s division of individuals into extraverts and introverts. 


4. What conclusions as to theories of mental pathology are forced by the 
facts of war shock? 


5. Develop a theory as to what happens in the nervous system when an 
individual is dissociated. Show how some case of dissociation would be 
explained by this theory. 


The students in the general psychology course were given a choice 
of four out of five questions on their three-hour final examination. 
The directions on the test blanks were the same as those above. 
The questions follow: 


1. Discuss centrally aroused sensations with reference to the following 
points: 

(a) Their differences from peripherally aroused sensations as shown by 
experimental studies. 


(b) The various types of concrete imagery and the relative dominance of 
these types. 


(c) Eidetic imagery and the various groups of individuals in which it is 
found. 
2. Discuss the physiological factors in depth (distance) perception. 


3. Compare and contrast the various theories of the perception of apparent 
motion. 


4. What are the three major forms of inhibition exhibited in the various 
memory processes? Outline the conditions which seem to be responsible for 
each form, wherever possible, citing specific experimental investigations. 


5. Describe as thoroughly as possible the inference process. How does
attitude (direction) influence the process? 


Scoring keys were made up for each test and on the basis of these 
keys the tests were scored as objectively as possible. 

The Results.—Although the two groups contained one hundred ninety-eight
and one hundred three subjects, respectively, in the following sections
they will be treated as if they had contained only one hundred fifty-five
and ninety-five subjects, respectively. This was made necessary because
forty-three students in the abnormal psychology group and eight in the
general psychology group did not answer all five questions. The reason
for the larger number in the first group lies in the fact that the test
was much too long for an hour examination; it was because of this that
the second group was used. It should be noted, however, that the
following results probably fall somewhat short of the whole truth, as the
writer suspects from a careful inspection of the individual test results
that quite a number of the students in both groups who answered all five
questions did not answer the question which they would have omitted as
conscientiously as they answered the first four questions.

The simplest evidence which throws light on the problem in hand 
is given by the percentages of individuals in the two groups who made 
better total scores on the examinations when the answer which they 
considered poorest was counted in place of the poorest of their four 
choices. In the abnormal psychology group sixty-two of the one 
hundred fifty-five individuals or forty per cent, and in the general 
psychology group forty-four of the ninety-five individuals or forty-six 
per cent made better scores when the answer which would have been 
omitted was counted in place of the poorest of their four choices. 
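
The comparison just described may be sketched in a few lines. The sketch is offered only as an illustration under stated assumptions: the function name is hypothetical, the scores on questions one to four are those of the general psychology subject used in this paper's example of standard scores, and the score on the fifth question is invented for the purpose.

```python
def chosen_and_best_totals(scores, omitted):
    """Return (total on the four chosen answers, total on the four best answers).

    scores  -- dict mapping question number to the score on that question
    omitted -- the question the student marked as the one he would have omitted
    """
    chosen_total = sum(s for q, s in scores.items() if q != omitted)
    # The four best answers are all five answers less the single lowest score.
    best_total = sum(scores.values()) - min(scores.values())
    return chosen_total, best_total


# Scores on questions 1-4 come from the paper's worked example;
# the score of 22 on question 5 is hypothetical.
scores = {1: 25, 2: 21, 3: 16, 4: 26, 5: 22}
chosen, best = chosen_and_best_totals(scores, omitted=5)
print(chosen, best)  # 88 94
```

A subject is counted among those who "made better total scores" whenever the second total exceeds the first, that is, whenever the answer he would have omitted outscores the poorest of his four choices.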

One may also compare the averages of the total scores on the four
chosen answers with those on the four on which the highest total score
was made. Table I gives such a comparison.


TABLE I.—RELIABILITY OF THE DIFFERENCES BETWEEN THE AVERAGES OF THE
FOUR CHOSEN AND THE FOUR BEST ANSWERS

             Four chosen answers       Four best answers       Diff./
Group        Mean     SD     SDm       Mean     SD     SDm     SD diff.¹
Abnormal     67.82    8.52    .68      69.69    8.55    .69      1.93
General      82.34   18.15   1.86      86.55   16.20   1.66      1.69

¹ The difference between the means of the four chosen and four best answers
divided by the standard deviation of the difference.


These results indicate that both groups of subjects made higher scores
on the average when the four best answers were counted than when the
answers to the four chosen questions were counted. When the reliability
of the differences between the means is determined by the critical ratio
technique it is seen that these ratios are not of sufficient magnitude for
statistical significance. However, a rather definite trend is indicated.
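
The critical ratios of Table I can be recomputed from the tabled means and standard deviations. The following is a sketch only. It assumes that the standard deviation of the difference was taken as the square root of the sum of the squared standard errors of the two means, a formula which treats the two sets of totals as uncorrelated; the paper does not state the formula, but this one reproduces the tabled values. The function name is hypothetical.

```python
import math


def critical_ratio(mean_a, sd_a, mean_b, sd_b, n):
    """Difference between two means divided by the SD of the difference.

    Assumption: SD of the difference = sqrt(SDm_a**2 + SDm_b**2),
    where SDm = SD / sqrt(n) is the standard error of each mean.
    """
    sdm_a = sd_a / math.sqrt(n)
    sdm_b = sd_b / math.sqrt(n)
    sd_diff = math.sqrt(sdm_a ** 2 + sdm_b ** 2)
    return (mean_b - mean_a) / sd_diff


# Values from Table I: four chosen answers versus four best answers.
abnormal = critical_ratio(67.82, 8.52, 69.69, 8.55, n=155)
general = critical_ratio(82.34, 18.15, 86.55, 16.20, n=95)
print(round(abnormal, 2), round(general, 2))  # 1.93 1.69
```

Since the same subjects furnish both totals, a formula allowing for the correlation between them would give a smaller standard deviation of the difference; the sketch does not attempt that correction.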

Both of the foregoing bits of evidence indicate that students when 
they have a choice of questions do not always omit the one on which 
they will make the lowest score. 
In order to study the relationship of the students’ judgments as to 
their achievements on the various questions with their actual achieve- 
ments, coefficients of contingency and coefficients of correlation were 
computed for each question between the rating which the students 
gave their answers and a rating of their actual achievements on the 
questions. A rating for actual achievement was determined in the 
following manner: A standard score¹ was found for each of the four
questions which the individual had chosen to answer. These were 
obtained by subtracting the mean score on a given question from the 
individual’s score on the same question and then dividing that devia- 
tion by the standard deviation of the distribution of the scores on that 
question. Thus, the first subject in the general psychology group who 
made scores of twenty-five, twenty-one, sixteen, and twenty-six on 
questions one, two, three, and four made the following standard scores 
on the four questions he chose to answer: +.99, -.03, +.65, and +.38.

The foregoing standard scores are obtained by using Table II.


TABLE II.—EXPERIMENTAL DATA ON THE FOUR CHOSEN QUESTIONS

             General psychology         Abnormal psychology
Question      N       M       SD         N       M       SD
   1          78    19.44    5.62       109    17.79    3.20
   2          84    21.27    8.06       149    20.78    3.82
   3          58    11.75    6.50       148    16.16    3.25
   4          78    23.95    5.42       134    16.62    3.95
   5          82    23.26    5.30        80    13.79    4.96


In Table II N is the number of individuals who answered a given 
question as one of the four choices, M is the average score made by the 
individuals who chose that question, and SD is the standard deviation 
of the distribution of those scores. Thus, the score of twenty-five on
the first question on the general psychology test is 5.56 points, or .99
of a standard deviation, above the mean score on that question. Likewise,
the score of twenty-one on the second question is .27 of a point, or .03
of a standard deviation, below the mean score on that question.
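
The arithmetic of the standard scores may be set out as follows. This is a minimal sketch: the means and standard deviations are those of Table II for the general psychology group, the raw scores are those of the subject cited in the text, and the function and variable names are hypothetical.

```python
# Mean and SD for questions 1-4, general psychology group (Table II).
TABLE_II_GENERAL = {
    1: (19.44, 5.62),
    2: (21.27, 8.06),
    3: (11.75, 6.50),
    4: (23.95, 5.42),
}


def standard_score(raw, mean, sd):
    """Deviation of the raw score from the question mean, in SD units."""
    return (raw - mean) / sd


# Raw scores of the first subject in the general psychology group.
raw_scores = {1: 25, 2: 21, 3: 16, 4: 26}
z = {
    question: round(standard_score(raw, *TABLE_II_GENERAL[question]), 2)
    for question, raw in raw_scores.items()
}
print(z)  # {1: 0.99, 2: -0.03, 3: 0.65, 4: 0.38}
```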

¹ z/SD, where z is the deviation of the individual score from the mean and SD
is the standard deviation of the distribution.

After determining the standard scores of each individual on each
of the four chosen questions in this fashion, ratings for actual
achievement were given on each question. This was accomplished by giving
the rating of one to the question on which an individual made the
highest standard score, of two to the question on which he made the
second highest standard score, of three to the one on which he made
the next highest standard score, and of four to the question on which
he made the lowest standard score. Thus, the subject who made the
standard scores of +.99, -.03, +.65 and +.38 on questions one, two,
three, and four, respectively, would receive an achievement rating
of one on question one, of two on question three, of three on question
four, and of four on question two.
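
The assignment of achievement ratings amounts to ranking the four standard scores from highest to lowest. A brief sketch follows; the function name is hypothetical, and the treatment of tied standard scores (here resolved in favor of the lower question number) is an assumption, since the paper does not say how ties were handled.

```python
def achievement_ratings(standard_scores):
    """Map each question to a rating: 1 for the highest standard score, 4 for the lowest.

    Assumption: ties keep their original question order, because sorted() is stable.
    """
    ordered = sorted(standard_scores, key=standard_scores.get, reverse=True)
    return {question: rank for rank, question in enumerate(ordered, start=1)}


# Standard scores of the subject used as the example in the text.
ratings = achievement_ratings({1: 0.99, 2: -0.03, 3: 0.65, 4: 0.38})
print(ratings)  # {1: 1, 3: 2, 4: 3, 2: 4}
```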

The relationship between the subject’s rating of his answer to a 
given question and his actual achievement rating was then determined 
by computing coefficients of contingency.¹ Pearson product-moment
coefficients of correlation were also computed as a check. Table III
gives these coefficients of contingency and their probable errors.²


TABLE III.—COEFFICIENTS OF CONTINGENCY BETWEEN THE SUBJECTIVE RATINGS
AND ACTUAL RATINGS OF ACHIEVEMENT

              General psychology                Abnormal psychology
Question      1     2     3     4     5         1     2     3     4     5
C            .27   .44   .43   .44   .41       .33   .32   .37   .39   .37
P.E.        ±.07  ±.05  ±.07  ±.05  ±.06      ±.05  ±.05  ±.04  ±.05  ±.06


The Pearson product-moment coefficients are not given as they show 
the same order but are on the average .10 lower. 
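
For readers unfamiliar with the coefficient of contingency, a sketch of its computation is given here. It assumes the usual Pearson form, C = sqrt(chi-square / (N + chi-square)), computed on a four-by-four table of subjective rating against achievement rating; the paper's own footnote on the method is not reproduced in this text, so the formula is an assumption. The two tables below are limiting cases constructed for illustration, not data from the study.

```python
import math


def contingency_coefficient(table):
    """Pearson's C = sqrt(chi2 / (N + chi2)) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n + chi2))


# Limiting cases only (not data from the study).
no_association = [[5] * 4 for _ in range(4)]
perfect_agreement = [[20 if i == j else 0 for j in range(4)] for i in range(4)]

print(contingency_coefficient(no_association))               # 0.0
print(round(contingency_coefficient(perfect_agreement), 3))  # 0.866
```

With four categories each way the coefficient cannot exceed the square root of three-fourths, about .87, even when the two ratings agree perfectly. The values in Table III are to be read against that ceiling rather than against unity.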

These data thus tend to show that there is a slight positive relation- 
ship between an individual’s rating of his achievement in answering 
an examination question and his actual achievement with reference 
to the average of the group. However, the fact that the relationship 
is so slight would tend to indicate that the student is not a very good 
judge of his knowledge in relation to the knowledge of the group. 
That is unfortunate, since it is the basis on which he is graded. 

Another method of treatment of the data is to discover what, if 
any, changes in grades on the examinations are brought about when 
the individuals are given grades on the basis of their total scores on 
their four best answers rather than their four chosen answers. In 
order to find out what changes occurred it was necessary to change each 
individual’s total scores on the four chosen answers and on the four 
best answers into standard scores. These standard scores were 
obtained by subtracting the mean score on the four chosen answers (or 
four best answers) from the total scores on the chosen answers and 
best answers, respectively. This deviation from the mean was then 
divided by the standard deviation of the distribution in question.¹
Thus, two standard scores were obtained for each individual—a stand- 
ard score for his achievement on his four chosen answers and a 
standard score for his achievement on his four best answers. 

These standard scores were then transformed into grades in the
following manner: Since twelve grades (A, A-, B+, B, B-, C+, C, C-, D+,
D, D-, and E) were to be given, the six standard deviations² of the
normal distribution were divided equally among the twelve grades so that
a grade of A was given to those individuals who made a standard score of
+2.5 and above; A- to those who made a standard score between +2.0 and
+2.5; B+ to those who made a standard score between +1.5 and +2.0; B to
those who made a standard score between +1.0 and +1.5; B- to those who
made a standard score between +0.5 and +1.0; C+ to those who made a
standard score between 0.0 and +0.5, etc.

To illustrate, take a subject in the abnormal group who made a
total of seventy-two points on his four chosen answers and a total of
seventy-seven points on his four best answers. His standard score is
+.49³ for his four chosen answers and +.85⁴ for his four best answers.
The grades corresponding to these standard scores would be C+
for the test scored on the basis of the four chosen questions and
B- for the test scored on the basis of the four best answers. In this
way two grades were computed for the test results of each individual
who made a higher total score on four questions other than the
chosen four.

¹ Table I gives the means and standard deviations.

² Only a very small percentage of the area of the normal curve is neglected under
such circumstances.

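
The conversion of standard scores into the twelve letter grades may be sketched as follows. Two points are assumptions, since the paper leaves them to the reader: the bands below C+ are taken to continue symmetrically in half-standard-deviation steps down to E, and a score falling exactly on a boundary is assigned to the higher band. The function name is hypothetical.

```python
import math

# Twelve grades, lowest to highest, one per half standard deviation.
GRADES = ["E", "D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+", "A-", "A"]


def letter_grade(z):
    """Assign a grade from a standard score using half-SD bands from -3 to +3.

    Assumptions: bands are symmetric about zero, lower bounds are inclusive,
    and the two end bands are open-ended (E below -2.5, A at +2.5 and above).
    """
    index = math.floor((z + 3.0) / 0.5)
    return GRADES[min(max(index, 0), len(GRADES) - 1)]


# The subject in the abnormal group: 72 points chosen, 77 points best
# (means and SDs from Table I).
z_chosen = (72 - 67.82) / 8.52
z_best = (77 - 69.69) / 8.55
print(round(z_chosen, 2), letter_grade(z_chosen))  # 0.49 C+
print(letter_grade(z_best))                        # B-
```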
Since there were forty-four individuals in the general psychology 
group and sixty-two individuals in the abnormal psychology group 
who made higher total scores on four questions other than the chosen 
four, the number and extent of the grade changes in these groups 
will give some indication as to the influence of choice of questions 
on grades. 


TABLE IV.—GRADE CHANGES

                         General psychology      Abnormal psychology
Group                     N    N/44   N/95        N    N/62   N/155
One grade lower           2    .05    .02         1    .02    .01
Same grade               19    .43    .20        29    .47    .18
One grade higher         22    .50    .23        22    .35    .14
Two grades higher         1    .02    .01         9    .15    .06
Three grades higher       0    .00    .00         1    .02    .01


Table IV gives the data on grade changes. Of the forty-four
individuals in the general psychology group who made higher total
scores on four questions other than the chosen four, a comparison of
grades on the basis of standard scores shows that two individuals or
five per cent obtained one grade lower (C to C- in one case and C-
to D+ in the other). This was two per cent of the group (ninety-five
cases) who attempted five questions. Nineteen individuals or
forty-three per cent of the group which made higher total scores on
questions other than the chosen four obtained the same grade for both
types of scoring. This was twenty per cent of the group who tried to
write on five questions. Twenty-two cases or fifty per cent of those
who made higher total scores obtained one grade higher, e.g., C+ to
B-. This was twenty-three per cent of the group which attempted
all five questions. Only one individual, two per cent of the group
which made higher scores, improved by two grades (in this case,
D to C-). This was only one per cent of the group which answered
five questions.

The figures for the abnormal psychology group are to be read in 
the same fashion. 
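
The proportions reported for the general psychology group can be recomputed directly from the counts, as in the sketch below. The counts and the two denominators (forty-four and ninety-five) are those given in Table IV and the text; the dictionary keys and function name are hypothetical.

```python
# Counts for the general psychology group, as given in Table IV.
GENERAL_CHANGES = {
    "one grade lower": 2,
    "same grade": 19,
    "one grade higher": 22,
    "two grades higher": 1,
    "three grades higher": 0,
}


def proportions(counts, denominator):
    """Express each count as a proportion of the denominator, to two places."""
    return {label: round(n / denominator, 2) for label, n in counts.items()}


of_improved = proportions(GENERAL_CHANGES, 44)  # those with a higher best-four total
of_all_five = proportions(GENERAL_CHANGES, 95)  # all who attempted five questions

# Per cent of the whole group raised by at least one grade.
raised = sum(n for label, n in GENERAL_CHANGES.items() if "higher" in label)
print(of_improved)
print(of_all_five)
print(round(100 * raised / 95))  # 24
```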

From both sets of data it seems clearly indicated that a con- 
siderable percentage of individuals taking this examination obtained 
better grades when their four best answers were counted than when the 
four chosen answers were counted. According to these results between 
twenty and twenty-five per cent of the students would have received 
at least one grade higher. This, of course, means that other indi- 
viduals’ grades would be lower. 

In order to explain this improvement in score and the resulting 
changes in grades among these individuals, the results on each of the 
individual questions should be considered. Table II gives the 
number of cases answering each question, the mean score on each 
question and the standard deviations of the distribution of scores on 
each question for both the general psychology and abnormal psychol- 
ogy groups. 

Obviously, the third question in the general psychology examina- 
tion is considered a difficult one by the students since only fifty-eight 
of the ninety-five cases answered it. This is the smallest number of 
individuals answering any of the five questions. Not only is the 
third question considered the most difficult one by the students as 
indicated by the smaller number of individuals answering it, but it is 
the most difficult one when it is considered from the point of view of 
the average score obtained by those individuals who did answer it. 
In Table II it is clearly shown that the average score on the third 
question is considerably lower than on any of the other questions. 
Thus, the third question seems to be the most difficult one. 

That it is this question which is responsible for the lower total 
scores, and consequently lower grades, in the cases of those individ- 
uals who made lower scores and grades on the four chosen answers 
as compared with their four best answers is indicated by the following 
facts. Of the fifty-eight individuals who chose this question, forty-one 
made lower scores on this question than they made on the ques- 
tion which they would have omitted. Since only forty-four individ- 
uals in the general psychology group made a higher total score on 
four questions other than the chosen four, ninety-three per cent of the 
lower scores and grades on the four chosen questions are directly due 
to the choice of the third question. 

On the abnormal psychology examination more or less the same
rôle as played by the third question in the general psychology
examination is played by the fifth question. In this case the fifth question
is answered by the smallest number of students, indicating that it is 
judged to be more difficult. At the same time it, too, has a lower 
average score than any of the other questions, indicating that it is 
more difficult than the other questions. Of the eighty individuals 
who answered this question, thirty-nine made lower scores on it than 
on the question which they would have omitted. Since only sixty-two 
individuals in this group made a higher total score on four questions 
other than the chosen four, sixty-three per cent of the lower scores and 
grades on the four chosen questions owe their existence to the choice 
of the fifth question. 

The meaning of this evidence is clear-cut. The choice of a ques- 
tion which is inherently more difficult, or for which the scoring stand- 
ards are higher, or for which the individual scoring the question has 
higher standards—or a combination of these factors—can influence 
to a considerable degree the score and grade which a given individual 
obtains on the choice type of essay examination. 

Since in so many fields of study, at the college level particularly,
the semester’s grade is determined almost entirely by the letter grade 
received on the final examination, this matter of choice of questions 
no doubt helps to make the letter grades for the semester all the more 
unreliable. According to the evidence presented, an individual could 
very easily receive a C grade in a course when he deserved a B (minuses 
and plusses are omitted in most grading systems at the college level) 
merely because he happened to choose a question which was more 
difficult in some respect, so that, although he may have written one of
the best answers on that particular question, his total score on which 
his grade is based does not reflect his real knowledge with reference to 
the group. Therefore, in view of the percentage of grade changes 
indicated in this study it seems rather unsound practice to continue 
the use of choices on essay examinations unless some technique is used 
so that the values of the questions are weighted according to their 
difficulty. 


SUMMARY AND CONCLUSIONS 


The matter of choice of questions on essay examinations was 
studied by using two large classes, one in general psychology and one 
in abnormal psychology. Both classes were given a five-question essay 
examination on which they were asked to write on four of the five 
questions and to indicate the order of excellence of their answers. 
Then they were to answer the fifth question, which they were told 
would be counted in place of the poorest of the answers they had 
chosen provided that the answer on this question would result in their 
obtaining a higher total score. Under these conditions the following 
conclusions are indicated: 
1. Students do not always omit the question on which they will 
make the lowest score. This is shown by the following results: 

(a) Forty-six per cent of the subjects in the general psychology 
group and forty per cent of the subjects in the abnormal psychology 
group made higher total scores when the answer which would have 
been omitted was counted in place of the poorest of the four choices. 

(b) A comparison of the average total scores on the four chosen
answers with the average total scores on the four best answers shows 
a definite, although not statistically significant, superiority for the 
four best answers in both groups. 

2. That students are unable to make accurate choices is probably 
due to the fact that they cannot judge their own knowledge with 
reference to the knowledge of the group. This is indicated by the 
coefficients of contingency and coefficients of correlation which were 
computed between the students’ ratings of their achievement on the 
four chosen questions and their actual achievement with reference 
to the group. The coefficients were all positive but low. 

3. On the basis of the four best answers changes in letter grades
on the examination occur which reveal that the grades under a choice of
questions system may be more unreliable than where all questions
are answered. This is indicated by the fact that twenty to twenty-five 
per cent of the individuals make better letter grades when their four 
best answers rather than their four chosen answers are counted. 
This change in grades was found to result in most part from the 
fact that these individuals had chosen a question which was either 
more difficult or scored more strictly than the question they would 
have omitted, so that the grade on the four best answers was probably 
a fairer one. 

4. In the light of this evidence it is suggested that unless the 
various questions are weighted in some suitable fashion the choice 
form of essay examination be discontinued. 














THE VALIDITY OF THE PORTEUS MAZE TEST 


S. D. PORTEUS 


Director of Psychological Clinic, University of Hawaii 


An article in the October 1937 issue of the JOURNAL OF EDUCATIONAL
PSYCHOLOGY by Moshe Brill¹ reported some work done with
the Porteus Maze, the subjects being one hundred inmates of the 
State Colony for Feebleminded Males at New Lisbon, N. J. Accord- 
ing to the author, the group consisted of “fifty socially well-adjusted 
and fifty seriously maladjusted mentally deficient boys.” Brill 
found that the maladjusted boys scored higher, on the average, than 
the well-adjusted. 

On the basis of these results he concludes that claims made for 
the Maze test, particularly as regards its use as a corrective to the 
Binet, are invalid. He further states that “earlier conclusions as to
the validity of the test from the standpoint of social adaptation were 
not fully justified.” These conclusions were presumably contained in 
a study by Berry and myself, written in 1918 and published in 1920, 
from which he quotes as follows: 

“The Porteus Tests represent an attempt to evaluate socially 
valuable characteristics not fully tested by the Binet. These capaci- 
ties are mainly prudence, forethought, planning capacity, ability to 
improve with practice, and adaptability to a new situation. The 
deficiencies in these respects, even more than in intellectual attain- 
ments, distinguish high-grade defectives from normal children; 
hence the value of the tests for diagnostic purposes. They fulfil 
the requirements of supplementary tests because they are standardized 
and arranged in the form of a scale. They can be easily applied and 
they test highly correlated mental practices in simple situations, so 
that results may be readily interpreted.”²

Had Brill wished to do so, he might have quoted from a still 
earlier publication, in which after making certain criticisms of the 
Binet—criticisms at that time new, but now so generally accepted as 
to be commonplace—I stated that the new tests “would prove a
valuable supplement to, and partial corrective of the Binet-Simon
Scale.”³

There are possibly many things which I have written during the
past twenty-five years that I would modify or retract, but these
passages are not among them, for the claims therein made seem
quite capable of substantiation. If, however, Brill has new light to
shed on the validity of these tests, these claims should certainly be
reconsidered.

In his article he gives the means of Binet test ages of the well- 
adjusted and the maladjusted groups as being 107.94 months and 
107.5 months, respectively, or about nine years. But the upper 
Binet limits are quoted as one hundred eighty-two months, or fifteen 
years two months, and one hundred eighty-seven months, or fifteen 
years seven months, respectively. In other words, somewhere about 
fifty per cent of his “mentally deficient” cases ranged in mental age
from nine to fifteen years plus! If, in addition, we consider that half 
of them were classified as being socially well-adjusted, what were they 
doing in a feebleminded institution? It would certainly seem as though 
the New Lisbon Colony required the services of a competent clinical 
psychologist to determine the mental status of its inmates. 

In the meantime, I would commend to Brill’s notice Goddard’s 
statement with regard to the diagnosis of mental deficiency. Writing 
in 1928, he said: “Honesty and fairness impel us to raise the question:
Is it possible that during all these years we have placed the limit of
feeblemindedness too high? Is the real limit seven years instead of
twelve?”⁴ He then goes on to imply that the lower limit is correct,
and in so doing arrived at a somewhat belated but perfectly safe 
conclusion. Note, of course, that it was a twelve-year limit and not a 
fifteen-year level he was discussing. 

Cyril Burt, seven years earlier, had reached a similar conclusion 
when he remarked that in an adult a Binet IQ of fifty instead of 
seventy should be considered diagnostic. The acceptance of the latter 
figure, as suggested by Terman, would, according to the army results, 
have stigmatized many millions of people in this country as mentally 
defective. 

For my own part, I have never accepted such a high critical level 
in mental diagnosis. It was because of my dissatisfaction with the 
accepted Binet standards that I devised the Maze tests as a supple- 
mentary and corrective measure. 

It seems clear that, whatever else they were, many of Brill’s 
subjects were certainly not feebleminded. If, as I suspect, they were 
merely delinquent, then I can readily understand his results. These 
would then merely confirm what I found sixteen years ago, in a 
similar comparison of the scores of individuals who did not adjust well 
to institution discipline. My conclusion was: 
“The delinquent groups also show the same tendencies to score 
higher in the Porteus Maze than the Binet, the relative advantage in 
Porteus age being greater for the dull normal group than for the group 
at feebleminded levels.”⁵

But quite apart from the question as to the mental status of 
Brill’s subjects, there is another matter worthy of comment. He 
found that forty per cent of the well-adjusted and thirty-eight per cent 
of the maladjusted passed the adult Maze test. This is something that 
falls entirely outside of my experience. Out of the last one hundred 
delinquent cases, not feebleminded, who have been examined at the 
Clinic of the University of Hawaii, only four per cent passed the adult 
test. 

This is in such striking contrast to Brill’s figures that it strongly 
suggests that his subjects’ scores were in many cases based on a second 
or third application of the test. Under such circumstances, these 
scores would be entirely invalid owing to the effects of practice. If the 
tests are applied a second time, the procedure is to invert the designs, 
cease testing at a single failure and deduct one year for each second 
trial. Only in this way can the practice effects be offset.⁶

That Brill’s scores are based on a second application of the tests 
is more likely in the light of an experience of my own when visiting 
the New Lisbon Colony some years ago. There I was shown a “games
room” conducted by two of the brighter inmates and among the
material I found my twelve and fourteen mazes, cut out of an illus- 
trated newspaper article and pasted on cardboard so that the other 
boys could practise finding their way out. It may well be that some 
of Brill’s subjects were already quite familiar with the test. 

In support of his criticisms of the Maze, Brill also quotes Garrett 
and Schneck to the effect that apart from an examination of many 
individual records, the evidence that the Maze measures aspects of 
character and temperament not tested by the Binet is exceedingly 
meager.7 It is equally true that the evidence for the Binet con-
sidered as a measure of social adaptation is, “apart from an examina-
tion of many individual records,” similarly scanty.

Against this earlier comment should be balanced Louttit’s and 
Stackman’s recent summary of twenty-eight published investigations 
of the relation of the Maze and Binet test ages, in which they conclude: 
“Our survey of evidence would appear strongly to corroborate Porteus’
own contention that the Maze test is a supplement to the Binet.
Together the two tests give a better picture of a child’s performance
ability than either by itself.”8
On the question of the relative values of Binet and Maze in com-
paring adjusted and maladjusted children, I would refer the reader to
studies by Poull and Montgomery9 and Karpeles,10 both of which
confirm my own findings and so run entirely contrary to Brill’s con-
clusions. Karpeles’ statement reads: “The present study tends to
strengthen Porteus’ claim that scores are lower in cases of delinquency*
and to confirm the findings made by Poull and Montgomery. It also
indicates that in the upper levels of intelligence there is an even
more marked tendency for socially maladjusted subjects to score
higher in the Mazes and for the delinquents to show inferior scores in
the Maze Tests.”

The fact that the weight of evidence points in a given direction 
should, however, not prevent the publication of adverse findings. 
Brill’s study, though entirely inconclusive because of inadequate 
experimental conditions, served one useful purpose. It sent me back 
to the problem of providing further proof of the validity of the Maze. 
No one is more aware of the fact that the existing proof is meager—not 
only for the Maze but for all other tests as well. I had already pub- 
lished nineteen studies, but perhaps the twentieth might show entirely 
different results. I am, therefore, reporting briefly two further 
investigations, one with delinquents and the other with two groups of 
defectives, both studies stimulated by Brill’s effort. 

These delinquents, both girls and boys, represent one hundred 
consecutive clinic cases. Their average Binet and Maze test ages were 
first compared. The difference between the means of the test results 
is negligible. By the Binet they tested ten years ten months, and 
by the Maze ten years eight months. As their average chronological 
age is about sixteen years, it will be realized that these socially malad- 
justed individuals test at least three years below the average for their 
age. That the maze is not still lower than the Binet is probably due 
to the fact that owing to language difficulties the latter is already 
depressed. The average Binet IQ of fourteen-year-old Honolulu chil- 
dren is about eighty-five, whereas the Maze TQ is a little below 
one hundred. 

In forty-one per cent of these cases the Maze was above the Binet,
in fifty-six per cent it was below. Thus deficiency in the traits tested
by the Maze would appear to be more often associated with delin-
quency than is a similar deficiency in the traits examined by the Binet.
Thirty per cent of the group tested below a test quotient of seventy
with the Maze, twenty-nine per cent with the Binet. It would be
foolish, of course, to assume that a marked deficiency in the traits
examined by either test includes all of the causal factors in delin-
quency. There is a host of other factors, chiefly environmental, as
well. The conclusion suggested by the above figures is, however, that
if it is worth while to apply a Binet test to delinquents, it is at least
equally worth while to apply the Maze.

*This refers to delinquents compared with well-adjusted individuals in a
community, not to delinquents compared with well-adjusted feebleminded in an
institution. These latter are, of course, not socially well-adjusted in the ordinary
sense of the term.


If Terman’s suggested seventy IQ borderline is accepted, then
twenty-nine per cent were feebleminded.11 Where the Maze is used
as a supplementary test and both test verdicts are considered in
diagnosis, then this number, with both test quotients less than seventy,
is reduced to seventeen per cent. This is a much more reasonable
estimate, agreeing closely with the findings of other observers as to
the relation of mental deficiency to delinquency. Considering only
those with test quotients on both tests below sixty-five, then the
number of feebleminded by this criterion would equal seven per cent,
an estimate agreeing with the most conservative conclusions. My
own practice now inclines towards considering all those who average
60 or less in the two tests combined as being mentally defective. This
number equals twelve per cent. Because of the comparatively higher
scores reached by children on the mainland, the critical average there
could probably be raised to seventy. In any case, the advantages of
supplementing a Binet test with the Maze would seem to be obvious.
There are, however, objections to pooling test quotients, and a better
procedure is to consider the test verdicts separately. Diagnostic
descriptions such as “defective school learning capacity but normal
planning capacity in practical situations” are then commonly made.

We may now turn to the studies of the mentally deficient. The
subjects were inmates of the Waimano Home for Feebleminded in
Honolulu, and consisted of individuals belonging to all the racial
groups resident in Hawaii. Sixty-five cases resident at Waimano
Home who had been examined were selected from the Clinic files and
their names submitted to Dr. Taylor, medical superintendent of the
Home, who rated them on a ten-point industrial scale. This scale
included those who were incapable of self-help up to those whom Dr.
Taylor thought might be capable of self-support outside. As a matter
of fact, the lowest three ratings—(1) incapable of self-help in dressing
and feeding, (2) capable of limited self-help, and (3) dresses and feeds
without help—were not used, as all the children rated, except one,
were capable of complete self-help. The ratings really began then
with the fourth step—“capable of simplest tasks beyond self-help.”

The Binet test IQs were found to correlate .35 with Dr. Taylor’s 
ratings, the Maze test .59. As far as estimated industrial capacity of 
feebleminded is concerned, the Maze is thus seen to provide a closer 
correspondence. The correlation would have been larger had low 
grade cases been included. 

Dr. Taylor was then asked to suppose that his institution was 
to be closed and the inmates returned to the community in five groups 
in descending order, according to how he thought they could adapt 
themselves to society, rating those most ready for parole 5, and those 
least ready 1. Out of the whole body of inmates he selected five 
groups with ten individuals in each, picking out those individuals 
whom he knew longest and best. Thus his ratings represented a 
measure of social adaptability or degree of feeblemindedness. Out 
of this number thirty-two had been tested at the Clinic, and were 
distributed as follows: Seven in each of groups 1, 2 and 3, six in group 4, 
and five in group 5. These last individuals were retained on the list in 
order to present the full range of feeblemindedness, even though they 
were all low-grade cases and tested below four years both in the Binet 
and the Maze. As the numbers were small, the rank-order method was 
used in calculating the correlation, which proved to be .57 with the 
Binet and .77 for the Maze. These coefficients indicate that as a 
diagnostic measure of ability for self-management and self-support, 
the Maze test is more useful than the Binet. These results serve to 
confirm findings previously arrived at in Vineland. 
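
The rank-order method mentioned above can be illustrated with a short sketch. It is only an illustration: the data are hypothetical, the function names are invented for the example, and tied ranks are averaged, which is an assumption, since the text does not say how the tied group ratings were handled.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # mean of the 1-based positions
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def rank_order_correlation(x, y):
    """Rank-order correlation: the Pearson correlation of the two rank lists."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical parole-readiness groups (1 to 5) and test ages in years.
parole_group = [1, 1, 2, 2, 3, 3, 4, 5]
test_age = [3.5, 4.0, 6.0, 5.5, 7.5, 8.0, 9.0, 10.5]
rho = rank_order_correlation(parole_group, test_age)
```

With only five rating groups, ties are unavoidable, so the choice of tie handling can shift the coefficient noticeably in samples as small as thirty-two.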

This does not mean that the Maze is as generally valuable as the 
Binet; that would indeed be a presumptuous claim. In the field of 
educational prognosis, the Binet is, of course, much more useful. 
But as Louttit and Stackman point out, there has never been any 
suggestion on my part that the Maze should be substituted for the 
Binet. Whatever comparisons have been instituted have been made 
for the purpose of demonstrating that the common practice of relying 
on the Binet alone for diagnosis has no justification in fact, and that 
the use of the Maze in conjunction with the Binet is certainly indicated. 

That the new Binet requires supplementing because it places a 
similar premium on language facility as did the old Stanford Binet is 
shown by a recent experience here. The average Terman-Merrill 
Binet IQ of sixty cases, consisting of three groups of nine-, ten- and 








178 The Journal of Educational Psychology 


eleven-year-old school children in an average district in Honolulu, 
was found to be eighty-two for each group, indicating that the new 
test was uniformly graded at too high a level for children with such 
language disabilities as Honolulu children possess. At these ages it is 
about eighteen months too difficult. Hence the necessity for using 
performance tests will continue. 

In conclusion I might state again that I place no exaggerated value 
on the Maze test. All that I claim for it is that it appears to be the 
best test available for testing prudence and planning capacity at or 
about normal levels. Though seven years per Binet may be the lower 
limit of normality, there is a borderland zone extending as regards 
mental age from seven to eleven years. This zone is populated by 
both normals and defectives. Their differentiation depends, there- 
fore, upon the proof of the possession of traits other than those tested 
by the Binet, and among the most important of these are prudence, 
forethought, planning capacity. Immediately some one devises a 
measure of these socially essential capacities that is better than the 
Maze, I would join most heartily in condemnation of the old test and 
advocacy of the new. At the present writing I see no reason to change
my practice.


LIST OF REFERENCES 


1. Brill, Moshe: “The Validity of the Porteus Maze Test.” J. of Educ. Psych.,
Oct., 1937, p. 491.
2. Berry, R. J. A. and Porteus, S. D.: Intelligence and Social Valuation. Publica-
tions of The Training School at Vineland, N. J., No. 20, May, 1920.
3. Porteus, S. D.: “Mental Tests for Feebleminded: A New Series.” J. of
Psycho-Asthenics, June, 1915, pp. 201, 202.
4. Goddard, H. H.: “Feeblemindedness: A Question of Definition.” Proc. Am.
Assoc. for Study of Feebleminded, June, 1928.
5. Burt, Cyril: Mental and Scholastic Tests. P. S. King & Son, London, 1921.
6. Porteus, S. D.: Studies in Mental Deviations. Publications of The Training
School at Vineland, N. J., No. 24, October, 1922, pp. 102, 112.
7. Garrett and Schneck: Psychological Tests: Methods and Results, pp. 84, 85.
8. Louttit, C. M. and Stackman, H.: “The Relationship Between Porteus Maze
and Binet Test Performance.” J. of Educ. Psych., January, 1936.
9. Poull, Louise E. and Montgomery, Ruth P.: “The Porteus Maze Test as a
Discriminative Measure in Delinquency.” J. of Appld. Psych., April, 1929.
10. Karpeles, Lotta M.: “A Further Investigation of the Porteus Maze as a Dis-
criminative Measure in Delinquency.” J. of Appld. Psych., August, 1932.
11. Terman, Lewis M.: The Measurement of Intelligence. Houghton Mifflin Co.,
p. 81.




THE RELATION OF VERBAL ABILITY TO 
IMPROVEMENT WITH PRACTICE 
IN VERBAL TESTS 


HERBERT WOODROW 


University of Illinois 


The problem considered in this paper is whether, if two tests have 
a common factor, practice in one will transfer to the other. If the 
common factor is a narrow, relatively specific factor, it is perhaps 
reasonable to suppose that transference would occur. The answer is 
quite problematical, however, if the common factor is a broad ability, 
an ability, that is, upon which depend the scores of a large number of 
tests. Transference between tests having in common one such broad 
ability should occur if practice in one or more tests of that ability 
produced a gain in the amount of that ability possessed by the sub- 
jects. Failure to find transference, on the other hand, would indicate 
lack of improvement with practice in the common ability, and, 
further, would demonstrate that the improvement noted as a result 
of the practice must have some other explanation. 

The answer to this problem of the relation of broad or general 
abilities to transference no doubt depends upon a number of variables, 
such as the age of the subjects, number and nature of the tests prac- 
ticed, length of practice, etc. Moreover, it should not be assumed 
that the answer is the same in the case of all abilities. An adequate 
understanding of the problem may require years of experimentation. 
Of the several experiments which have been made as a first approach 
to this problem, the one to be here described, while involving only a 
very small amount of practice, deals with a factor which has often 
been regarded as one of the most important; namely, verbal 
intelligence. 

A group of sixty-five subjects was given practice in two tests con- 
sidered to be tests of verbal ability or verbal intelligence; namely, 
verbal analogies and categorical anagrams, that is, rearranging letters 
to make words of a specified category. Different test forms were 
used at every practice period. While the practice lasted only ten 
sittings, and each test was practiced only from five to seven minutes 
at each sitting, it was sufficient to produce a gain of forty-one per cent 
in the analogies test and one of fifty and five-tenths per cent in the 
categorical anagrams test. The mean gain in the former was thirteen 
times its standard deviation and in the latter over twenty times its 
standard deviation. It should be stated that two additional tests 
were practiced; namely, horizontal adding and digit cancellation.
It is believed that this latter fact may be ignored as having little 
bearing on the conclusions here reported. In addition, end tests were 
given both to the practice group of sixty-five subjects and to a control 
group of sixty-two subjects. 

To determine which of the end tests have a factor in common with
the practice tests some method of factor analysis must be used. In the
present report reliance is placed on the tetrad difference method.
Scrutiny of the intercorrelations revealed five tests yielding fifteen
tetrad differences,1 none of which was greater than 1.03 times its
standard deviation.2 Consequently these five tests may be assumed
to have one and only one common factor.3 They serve to identify
the common factor and will be termed reference tests. The five refer-
ence tests which gave these insignificant tetrad differences were the
following: initial scores of the two practiced tests; namely, verbal
analogies and categorical anagrams, and initial scores of the following
three end tests: artificial language; narrative completion; and prov-
erbs—the latter two tests being taken from the Otis Advanced Intel-
ligence Examination. From the nature of the tests, it would seem
clear that the common factor must be what is ordinarily termed
verbality or verbal intelligence. The categorical anagrams test also
correlated with a second broad factor, but the remaining reference
tests had no significant loadings with any other factor than the one
here termed verbal intelligence. All the test score intercorrelations,4
including the intercorrelation of the five reference tests, are given in
Table I.





1 The three tetrad differences for any four tests, 1, 2, 3, 4, may be written as
follows:

$$t_{1234} = r_{12}r_{34} - r_{13}r_{24}$$
$$t_{1243} = r_{12}r_{34} - r_{14}r_{23}$$
$$t_{1342} = r_{13}r_{24} - r_{14}r_{23}$$

2 The formula used for the standard deviation of any tetrad difference was that
given by Holzinger, Preliminary Report on the Spearman-Holzinger Unitary Trait
Study, Univ. of Chicago, No. 2, p. 13.

3 For proof of this statement, see Spearman, The Abilities of Man, 1927,
Appendix, iii.

4 Since the reliability coefficient of the verbal analogies test increased with
practice from +.828 to +.945, the correlations with the final analogies test score
have been corrected to the magnitude they would have, had the reliability remained
unchanged at +.828, by multiplying the obtained r’s by $\sqrt{.828}/\sqrt{.945}$. In the
case of the categorical anagrams, no correction was made, since the reliability of
both initial and final scores was +.89.
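
The tetrad criterion described in the footnotes can be sketched in a few lines. This is a minimal illustration, not the procedure actually carried out: the loadings are hypothetical, the function names are invented here, and the standard deviation of each tetrad (Holzinger's formula) is not reproduced. The sketch shows only that five tests yield fifteen tetrad differences and that all of them vanish when a single common factor accounts for every intercorrelation.

```python
from itertools import combinations


def tetrad_differences(r, a, b, c, d):
    """Return the three tetrad differences for tests a, b, c, d.

    r is a symmetric matrix (list of lists) of intercorrelations.
    """
    t_abcd = r[a][b] * r[c][d] - r[a][c] * r[b][d]
    t_abdc = r[a][b] * r[c][d] - r[a][d] * r[b][c]
    t_acdb = r[a][c] * r[b][d] - r[a][d] * r[b][c]
    return t_abcd, t_abdc, t_acdb


def all_tetrads(r):
    """Tetrad differences for every set of four tests in the matrix."""
    return {quad: tetrad_differences(r, *quad)
            for quad in combinations(range(len(r)), 4)}


# Hypothetical loadings of five tests on one common factor.
loadings = [0.7, 0.6, 0.5, 0.8, 0.4]
# With a single factor, each correlation is the product of two loadings.
r = [[1.0 if i == j else loadings[i] * loadings[j] for j in range(5)]
     for i in range(5)]

tetrads = all_tetrads(r)
n_tetrads = sum(len(ts) for ts in tetrads.values())  # 5 sets of four, 3 each
largest = max(abs(t) for ts in tetrads.values() for t in ts)
```

In real data the tetrads are never exactly zero, which is why each one has to be judged against its own standard deviation.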
TABLE I.—TEST INTERCORRELATIONS

A. Intercorrelations of Reference Tests

Test                        |  2  |  3  |  4  |  5
 1. Artificial language     | 507 | 447 | 530 | 457
 2. Proverbs                | ... | 520 | 554 | 323
 3. Narrative completion    | ... | ... | [?] | [?]
 4. Initial analogies       | ... | ... | ... | [?]
 5. Initial anagrams        | ... | ... | ... | ...

B. Remaining Intercorrelations

Test                        |  6  |  7  |  8  |  9  | 10  | 11  | 12
 1. Artificial language     | 457 | 513 | 530 | 364 | 570 | 523 | 448
 2. Proverbs                | 470 | 469 | 630 | 527 | 636 | 452 | 428
 3. Narrative completion    | 318 | 283 | 425 | 284 | 395 | 453 | 345
 4. Initial analogies       | 657 | 431 | 723 | 527 | 747 | 627 | 291
 5. Initial anagrams        | 256 | 561 | 419 | 386 | 326 | 344 | 823
 6. Rearranging words       | ... | 447 | 631 | 475 | 612 | 464 | 377
 7. Rearranging syllables   | ... | ... | 486 | 412 | 400 | 343 | 596
 8. Opposites               | ... | ... | ... | [?] | [?] | [?] | [?]
 9. Similarities            | ... | ... | ... | ... | [?] | [?] | [?]
10. Categories (Thurstone)  | ... | ... | ... | ... | ... | [?] | [?]
11. Final analogies         | ... | ... | ... | ... | ... | ... | [?]
12. Final anagrams          | ... | ... | ... | ... | ... | ... | ...

[?] = entry illegible in the source.

Test 1 was an American Council for Education test, 1930 edition. 

Tests 2, 3, and 9 were Otis Intelligence Tests, Advanced Examination. 

Test 8 was a list of seventy-five choice-of-five opposites. 

Test 7 consisted in rearranging, by numbering, syllables to make words of four, 
five, or six syllables. 

Test 10 consisted in checking short lists of words so as to indicate to which of 
two categories the words belonged, these categories being illustrated by two other 
short lists of words. 

Tests 4 and 11 were especially constructed lists of seventy-five choice-of-five 
analogies. No analogy used in the practice lists was included in the initial and 
final tests. 

Tests 5 and 12 consisted in rearranging letters to make words, the words 
constituting lists, of ten each, of six specified categories. 


The correlation of each of the five reference tests with the common
factor was calculated from their intercorrelations, in the usual way.1
The correlation of the remaining end tests with this factor was cal-
culated in a similar manner, after first calculating all the tetrads
between any one of them and the five reference tests, and rejecting
any test which, in any combination with the others, gave a tetrad
difference larger than two and one-half times its standard deviation.
The correlations thus obtained between each of the tests and the
common factor of the five reference tests are shown in Table II.

1 By formulae of the sort,

$$r_{1g}^2 = \frac{r_{12}r_{13} + r_{12}r_{14} + r_{12}r_{15} + r_{13}r_{14} + r_{13}r_{15} + r_{14}r_{15}}{r_{23} + r_{24} + r_{25} + r_{34} + r_{35} + r_{45}}$$
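
The loading formula can be sketched as follows. The sketch assumes the single-factor form in which the squared loading of a test is the sum of the pairwise products of its correlations with the other tests, divided by the sum of the correlations among those other tests. The function name and the correlation matrix are illustrative: the matrix is built synthetically from five loadings, so that the code can be checked by recovering them.

```python
from itertools import combinations


def single_factor_loading(r, k):
    """Loading of test k on the one common factor of the tests in r.

    r is a symmetric matrix (list of lists) of intercorrelations.
    """
    others = [i for i in range(len(r)) if i != k]
    pairs = list(combinations(others, 2))  # six pairs when there are five tests
    numerator = sum(r[k][i] * r[k][j] for i, j in pairs)
    denominator = sum(r[i][j] for i, j in pairs)
    return (numerator / denominator) ** 0.5


# Synthetic check: start from five loadings, build the correlations they
# imply under a single factor, and confirm the formula returns them.
true_loadings = [0.732, 0.739, 0.650, 0.714, 0.475]
r = [[1.0 if i == j else true_loadings[i] * true_loadings[j] for j in range(5)]
     for i in range(5)]
recovered = [single_factor_loading(r, k) for k in range(5)]
```

With observed correlations the recovered loadings are estimates, and the residuals left after removing the factor show how well the single-factor assumption holds.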


TABLE II.—LOADINGS (r) WITH THE COMMON FACTOR

Test                          |  r   | Tests used for r
Practice tests
 4. Initial analogies         | .714 | 1, 2, 3, 4, 5
11. Final analogies           | .657 | 1, 2, 3, 5, 9, 11
 5. Initial anagrams          | .475 | 1, 2, 3, 4, 5
12. Final anagrams            | .568 | 1, 2, 3, 4, 9, 12
End tests
 1. Artificial language       | .732 | 1, 2, 3, 4, 5
 2. Proverbs                  | .739 | 1, 2, 3, 4, 5
 3. Narrative completion      | .650 | 1, 2, 3, 4, 5
 6. Rearranging words         | .575 | 1, 2, 3, 5, 6
 7. Rearranging syllables     | .594 | 1, 2, 3, 4, 7
 8. Opposites                 | .761 | 1, 2, 3, 5, 8
 9. Similarities              | .628 | 1, 2, 3, 4, 5, 9
10. Categories (Thurstone)    | .737 | 1, 2, 3, 5, 10








Table II clearly establishes that both the practice tests and the
end tests were heavily dependent upon the common verbal factor.
The eight end tests considered as a battery constituted an excellent
test of this factor, since the correlation of their weighted sum with the
common factor1 is no less than +.94.

1 Calculated by the formula $r_{pg} = \sqrt{S/(1+S)}$, in which $r_{pg}$ indicates the corre-
lation between the pool of tests and the common factor and $S = \sum_{x=1}^{n} r_{xg}^2/(1 - r_{xg}^2)$,
x standing for each of the tests in turn. See Holzinger, K. J., op. cit., No. 2, p. 12.

The factor correlations (or loadings) given in Table II almost
completely account for the intercorrelations of the reference tests (on
the usual assumption that the amount of correlation between any
two tests produced by one factor is the product of the correlations of
the tests with that factor). The residuals, that is, the differences
between the original correlations (given in Table I) and the correla-
tions accounted for by the common factor, have a mean value of
+.001, with a standard deviation from the mean of .029. They
range only from +.053 to −.033, and none of them are significant.
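
The figure of +.94 for the battery can be checked from the eight end-test loadings of Table II. The sketch below assumes the pooling formula $r_{pg} = \sqrt{S/(1+S)}$ with $S = \sum r_{xg}^2/(1 - r_{xg}^2)$; the function and variable names are illustrative only. A small helper for residuals is included on the stated assumption that the correlation produced by one factor is the product of the two loadings.

```python
from math import sqrt

# Loadings of the eight end tests on the common factor (Table II).
END_TEST_LOADINGS = {
    "Artificial language": 0.732,
    "Proverbs": 0.739,
    "Narrative completion": 0.650,
    "Rearranging words": 0.575,
    "Rearranging syllables": 0.594,
    "Opposites": 0.761,
    "Similarities": 0.628,
    "Categories (Thurstone)": 0.737,
}


def pooled_factor_correlation(loadings):
    """Correlation of the weighted pool of tests with the common factor."""
    s = sum(r * r / (1 - r * r) for r in loadings)
    return sqrt(s / (1 + s))


def residual(observed_r, loading_a, loading_b):
    """Observed correlation minus the part accounted for by the factor."""
    return observed_r - loading_a * loading_b


r_pg = pooled_factor_correlation(END_TEST_LOADINGS.values())
print(round(r_pg, 2))  # 0.94
```

Because each test enters through $r^2/(1 - r^2)$, the tests with the highest loadings dominate the pool, and adding further tests can only raise the pooled correlation.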

A comparison of the gains made by the practice and control groups
in the eight end tests is afforded by Table III. While the initial
scores of the two groups, given in columns P_I and C_I, were not identical
in the case of any of the tests, the discrepancies were small—much too
small to permit fear that the absolute increments shown by the final
scores over the initial scores might be incomparable because of possible
differences in the significance of the units at different parts of the
scale. That the increments are frequently negative is due to the
fact that, to ensure that no subject would have more than enough time
to complete the test and thus render impossible a determination of
his proper score, the time allowed at the final trial was shorter than
at the initial trial in all tests except artificial language (for which the
time was eleven minutes at both trials). The values given in the
column headed P_G − C_G show whether the mean gain of the practice
group (P_G) was greater or less than that of the control group (C_G).
The standard deviation of the mean of the gains of each group was
calculated directly from the list of the gains, and the standard devia-
tion of the difference between the two means (column headed σ_diff.)
was then calculated by the usual formula for the standard deviation
of the difference between the means of two uncorrelated lists of
values. The significance of the difference between the mean gains of
the two groups, or the critical ratio, is shown in the last column of the
table, headed Diff./σ_diff.
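
The critical ratio just described can be sketched as follows. The gains are hypothetical and the function names are invented for the example. The sketch also assumes the sample standard deviation (divisor n − 1); the text does not state which divisor was used, and with groups of sixty-odd subjects the choice makes little difference.

```python
from math import sqrt
from statistics import mean, stdev


def sd_of_mean(values):
    """Standard deviation of the mean of a list of gains."""
    return stdev(values) / sqrt(len(values))


def critical_ratio(gains_a, gains_b):
    """Difference between two mean gains, its standard deviation, and their ratio.

    The two lists are treated as uncorrelated, so the variances of the
    two means simply add.
    """
    diff = mean(gains_a) - mean(gains_b)
    sd_diff = sqrt(sd_of_mean(gains_a) ** 2 + sd_of_mean(gains_b) ** 2)
    return diff, sd_diff, diff / sd_diff


# Hypothetical gains for a few practice and control subjects.
practice_gains = [2, 4, 6, 8]
control_gains = [1, 3, 5, 7]
diff, sd_diff, ratio = critical_ratio(practice_gains, control_gains)
```

Treating the two lists as uncorrelated is appropriate here because the practice and control groups consist of different subjects.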

Table III makes it entirely obvious that the end tests, which cor- 
related highly with a factor common to the practice tests, did not all 
show greater improvement in the case of the practice group than in 
that of the control group. In fact, only one half of them did so. 
Furthermore, it is doubtful whether the gain of the practice group was 
in any case significantly different from that of the control group. The 
critical ratios vary only from +1.68 to −2.07 and average +.08.
According to the usual criterion, then, none of the differences in mean 
gain between the practice and control groups was significant. It is 
possible that, with larger groups and more reliable means, significant 
transfer might be discovered from practice such as here given to 
opposites and, perhaps, even to rearranging words and artificial 
language. On the other hand, the largest significance ratio, 2.07, 
was negative, so that so far as the present experiment goes, the most
significant difference in the gain of the two groups was in favor of the
control group.


TABLE III.—COMPARISON OF THE PERFORMANCES OF THE PRACTICE AND CONTROL GROUPS

P_I and C_I stand for the mean initial scores of the practice and control groups,
respectively.

P_G and C_G stand for the mean gain of the practice and control groups, respec-
tively.

Values in the column P_G − C_G show whether the practice group gained more or
less than the control group. σ_diff. means the σ of the difference P_G − C_G.

                            | Practice group | Control group |           |         |
Test                        |  P_I  |  P_G   |  C_I  |  C_G  | P_G − C_G | σ_diff. | Diff./σ_diff.
 1. Artificial language     |  28.1 | +18.4  |  25.7 | +15.5 |   +2.97   |  1.94   |   +1.53
 2. Proverbs                |  10.8 |  −.2   |  10.6 |  +.3  |   −.53    |   .56   |   −.95
 3. Narrative completion    |  14.6 |  −.9   |  15.2 | −1.6  |   +.73    |   .79   |   +.92
 6. Rearranging words       |  40.1 | −7.7   |  37.2 | −8.8  |   +1.14   |  1.09   |   +1.05
 7. Rearranging syllables   |  34.9 | +5.8   |  32.6 | +6.8  |   −1.02   |  1.76   |   −.58
 8. Opposites               |  47.1 | +1.2   |  46.4 |  −.6  |   +1.76   |  1.05   |   +1.68
 9. Similarities            |  13.1 | −3.5   |  13.3 | −3.0  |   −.53    |   .58   |   −.91
10. Categories (Thurstone)  | 103.2 | −33.8  |  97.3 | −24.8 |   −9.04   |  4.36   |   −2.07
                                                                             Mean =     +.08
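
As a check on the last column of Table III, the critical ratios can be recomputed from the tabled differences and their standard deviations. The figures below are taken from the table; the variable names are illustrative.

```python
# (P_G - C_G, sigma_diff) for each end test, as given in Table III.
TABLE_III = {
    "Artificial language": (2.97, 1.94),
    "Proverbs": (-0.53, 0.56),
    "Narrative completion": (0.73, 0.79),
    "Rearranging words": (1.14, 1.09),
    "Rearranging syllables": (-1.02, 1.76),
    "Opposites": (1.76, 1.05),
    "Similarities": (-0.53, 0.58),
    "Categories (Thurstone)": (-9.04, 4.36),
}

# Critical ratio = difference in mean gain / its standard deviation.
ratios = {test: diff / sd for test, (diff, sd) in TABLE_III.items()}
mean_ratio = sum(ratios.values()) / len(ratios)
favoring_practice = sum(1 for value in ratios.values() if value > 0)

print(round(mean_ratio, 2))   # 0.08
print(favoring_practice)      # 4
```

The recomputed ratios agree with the tabled values to two decimals, and exactly half of the eight tests favor the practice group.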
Now, if the marked improvement in the practiced tests had been
accompanied by an improvement in their common factor, that is, by
an increase in the mean amount of verbal ability possessed by the
subjects composing the practice group, the mean gain of the practice
group should have been greater than that of the control group in all or
nearly all of the end tests. Since superiority of gain in the end tests
on the part of the practice group over the control group is conspicu-
ously absent, the experiment clearly establishes that the gain in the
tests practiced by the practice group entailed no appreciable improve-
ment in the general factor, that is, no increase, on the average, in the
amount of verbal intelligence possessed by the members of the practice
group. Likewise, it follows that the mean gain in the case of the
practice tests could not be due to any increase in verbal ability con-
sidered as a general factor.

The experiment affords no positive answer to the question of
what factors did bring about the improvement in the practice test
means. That the improvement was not due to an increase in speed
conceived as a general factor is indicated by the fact that in a test of 
speed of making crosses, used as a transfer test, the control group 
showed a greater gain than the practice group (or more accurately, a 
smaller loss with shortened test-duration). 

It is possible that some inference concerning the cause of the 
mean improvement with practice may be drawn from the effect of 
practice upon the correlation of the practiced tests with the common 
factor of verbal ability. In the present experiment very little change 
with practice occurred in the correlations between the practiced tests 
and verbal ability. Thus, in the case of analogies, the correlation 
showed a slight decrease, from .714 to .657, while in the case of the 
categorical anagrams test, the correlation with the general factor 
increased slightly, from +.475 to +.568. Certainly the changes in 
the correlations are both too small to be regarded as verifiable. Since 
a correlation between two sets of scores is based on deviations from two 
means and may not be affected by changes in the means themselves, 
it follows that changes in the correlation of test scores with a general 
factor do not permit conclusions pertaining to the question of change 
in the mean amount of that factor possessed by the subjects. Only a 
transference experiment such as has been described will do that. The 
correlations of the scores with the general factor do indicate, though, 
the relative importance of that factor in determining the deviations of 
the individual scores from the mean, that is, in determining what are 
called the standard scores. When, then, as in the present experiment, 
these correlations show little change, about the only conclusion justified 
is that verbal ability did not with practice increase in its importance 
relative to other factors, as a determinant of the subjects’ scores. It 
may be concluded, then, that the improvement which occurred with 
practice in the present experiment in verbal intelligence test scores did 
not involve either an increase in the amount of verbal intelligence
possessed by the subjects or an increase in the relative importance of
verbal intelligence as a determiner of score. The same conclusion 
presumably holds as regards Spearman’s g factor, since the g factor is 
either identical with, or included in, the factor here termed verbal 
ability. 
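
The point that a correlation depends only on deviations from the means, and so cannot reveal a change in the means themselves, can be illustrated with a small sketch. The scores are hypothetical and the helper name is invented for the example: every subject is given the same gain, and the correlation with the factor is unchanged.

```python
from statistics import mean


def pearson(x, y):
    """Product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical factor scores and initial test scores for six subjects.
factor = [3, 5, 6, 8, 9, 11]
initial = [20, 26, 24, 31, 35, 38]

# Suppose practice raises every score by the same amount.
final = [score + 10 for score in initial]

r_initial = pearson(factor, initial)
r_final = pearson(factor, final)
mean_gain = mean(final) - mean(initial)
```

Here the mean score rises by ten points while the correlation stays the same, which is why only a transference experiment can bear on changes in the mean amount of the factor.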

This conclusion does not exclude the possibility that with practice 
there may have been increases in the degree to which the subjects 
brought their verbal intelligence into play. In fact, it seems rather 
plausible that the increase in mean score was due mainly to such 
changes. If such increments in the utilization of verbal ability
occurred, however, in view of the lack of transference they must have 
been tied up with the particular test performances practiced. More- 
over, they could have had little correlation with the general verbal 
ability, since increments in a factor correlated with this ability should 
result in transference in the case of a battery of end tests which 
measured that ability. It follows that if increments in the utilization 
of verbal ability occurred, such increments formed a part of what in 
factor-analysis are termed specific abilities. In other words, the dis- 
tinction between specific and general ability may depend not so much 
on the ultimate nature of the ability, as upon the extent to which the 
subject is able to summon this ability to his aid in all those task 
performances in which the ability is useful. That portion which he 
can make use of in any one of a large number of tasks constitutes a 
broad or general ability, whereas an additional and uncorrelated 
portion which he can use in only one task may be conceived as con-
stituting, at least in part, his specific ability for that one task. This
interesting possibility is not eliminated by the obtained data. 

The experiment here reported should be interpreted, then, not as 
proving that the subjects did not with practice learn better to apply 
their verbal ability to the tasks practiced, but as proving that they 
acquired no increments of free or general verbal ability, increments, 
that is, which they could make effective in the performance of the 
end tests. The effect of practice in the present experiment was to 
produce expertness along specific lines, rather than increases in general 
ability, the general ability involved being a verbal ability. The 
determination of the situation in the case of other group factors must 
await the analysis of further experimental data. 


THE INFLUENCE OF THE KINAESTHETIC FACTOR IN 
THE PERCEPTION OF SYMBOLS IN PARTIAL 
READING DISABILITY¹ 


ARTHUR BERMAN 
Institute of Child Welfare, University of California 


This study attempts to answer the following questions: Does the 
addition of a kinaesthetic-tactual stimulus, in the form of manual 
motor movements added to visual and auditory stimuli, aid in the 
recognition (perception) and retention of nonsense syllables? Does 
the tracing of a nonsense syllable insure greater economy in learning 
and longer retention than a procedure which ignores manual tracing? 
Does this hold for more perceptual non-associative material such as 
geometrical figures? 

The literature dealing with the experimental phase of word recognition 
in comparative therapeutic methods is scant.² Kirk³ compared 
the relative efficacy of the manual-tracing method with the conventional 
sight method. He used six subjects of low mentality with 
intelligence quotients ranging from 63 to 80. Each subject was 
required to learn a list of five three-letter words by the manual-tracing 
method on one day, and another list of five words by the sight method 
on alternate days. A comparison of the efficacy of the two methods 
for first recall and relearning was made. 

Fildes⁴ investigated the subject’s ability to recognize words, letters, 
and figures previously taught as these were involved in the reading 
process. Her work included visual and auditory discrimination (recog- 
nition) and the ability to associate the two. 

In an unpublished study Keller⁵ attempted to measure the influence 
of manual-tracing upon word recognition. Three hundred eighty-four 
subjects were required to write daily all words not readily recognized. 
These words were subsequently checked for recognition at intervals, 
at the beginning and at the end of the semester. All phonetic drill 
and analysis were eliminated. A comparison of records for the two 
methods was made. 

1 This study was done at the University of California at Los Angeles. 

2 Early studies on the perceptual phase of reading, e.g., Cattell, Dodge and 
Erdmann, Goldscheider and Miller, etc., are not specific to this paper. 

3 Kirk, Samuel A.: “The Influence of Manual Tracing on the Learning of Simple 
Words in the Case of Subnormal Boys.” Jour. Educ. Psychol., Vol. xxv, 1933, pp. 
525-535. 

4 Fildes, Lucy G.: “A Psychological Inquiry Into the Nature of the Condition 
Known as Congenital Word Blindness.” Brain, Vol. xliv, 1921, pp. 286-307. 

5 Keller, Helen B.: Remedial Work on Disability in Word Recognition. University 
of California at Los Angeles. Unpublished. Abstract in Betts, E. H.: 
Elem. Eng. Rev., Vol. xii, 1935, p. 25. 

Keller,¹ in reporting the activities of the special classes for non-readers 
in the Los Angeles city schools utilizing the manual-tracing 
method, offers data on a study in word recognition. In four classrooms, 
totaling seventy-eight cases and twenty thousand eight hundred 
seventy words, eighty-six per cent of all words not recognized in the 
original reading were retained according to later tests; seventy-one 
per cent having been written only once. 

A summary of the daily teaching records of an adjustment room 
(eighteen cases) in the Los Angeles city schools for the year 1927-1928, 
indicates that five thousand six hundred forty-two words of five 
thousand nine hundred eighty which had been written by the pupil 
were recognized at a later date, or ninety-four per cent.² 


EXPERIMENTAL METHOD³ 


Subjects for Experiments I and II were seventeen elementary 
pupils with an average age of nine years nine months and a range from 
eight years two months to fifteen years one month. Although the 
number of subjects in the two experiments was the same, the subjects 
themselves were in part different. These subjects were obtained 
from the Sawtelle Boulevard Elementary School, Los Angeles, and the 
Psychological Clinic of the University of California at Los Angeles. 
The subjects at the elementary school were selected in the following 
manner: All records dealing with individuals who in any way indicated 
reading difficulty were examined; those showing at least a two-year 
retardation in vocabulary grade placement were segregated. This 
group was further checked for normal vision, audition, and intelligence. 
Any doubt as to eligibility of the subject served to eliminate him. 
The remaining subjects were selected at random from the University 
clinic subject to the criteria established. 

1 Keller, Helen B.: “Special Classes for Non-readers.” Fourth Yearbook, 
the Division of Psychol. and Educ. Res., Los Angeles City Schools, 1931, Pub. 211, 
pp. 99-105. 

2 Keller, Helen B.: “Remedial Reading in the Los Angeles City Schools.” 
Mental Measur. Monog., Vol. xi, 1936, pp. 31-32. 

3 This study was profitably divided into two analogous sections, hereafter 
known as Experiment I and Experiment II, the distinction being made upon a 
difference in learning material: nonsense syllables in Experiment I and 
geometrical figures in Experiment II. 

It is to be noted that all subjects were of normal intelligence 
(mean IQ 102.3). Many showed a history of long-standing reading 
disability with school failure and emotional concomitants. All cases 
either had at one time learned by the kinaesthetic-tactual method or 
were at present utilizing this technique. 

The learning materials used in this experiment (Exp. I) were 
nonsense syllables. The criterion at all times in the use of these 
nonsense syllables was similarity to the reading process without the 
actual use of words. In this way an attempt was made to control the 
apperceptive background of each individual, to facilitate presentation, 
and to aid in analysis of the subject’s perceptive attack. 

The upper quintile of Glaze’s¹ elaborate list of nonsense syllables 
showing the highest degree of association value was selected. All 
unpronounceable syllables were eliminated. Those syllables were 
also eliminated which were similar in visual or auditory form to any 
recognized English word of common usage. 

Of the original upper quintile group comprising five hundred seven 
syllables, twenty-six were eliminated by the first review, sixteen by the 
second. The remainder was shuffled and ninety syllables selected by 
chance; seventy as an actual learning test, ten as preliminary trials, 
and ten as alternates. 
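
The selection just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the placeholder labels stand in for Glaze's syllables (the actual list is not reproduced here), and the function name, the seed, and the shuffling routine are hypothetical, not the procedure the experimenter actually followed.

```python
import random

UPPER_QUINTILE = 507   # syllables in the upper quintile, per the text
REMOVED_FIRST = 26     # eliminated by the first review
REMOVED_SECOND = 16    # eliminated by the second review


def draw_syllables(pool, seed=None):
    """Select ninety items by chance and split them three ways."""
    rng = random.Random(seed)          # seed is used only for reproducibility
    chosen = rng.sample(pool, 90)      # ninety syllables selected by chance
    return {
        "learning": chosen[:70],       # actual learning test
        "preliminary": chosen[70:80],  # preliminary trials
        "alternates": chosen[80:],     # alternates
    }


# Placeholder labels (hypothetical); the real syllables are in Glaze's list.
remaining = UPPER_QUINTILE - REMOVED_FIRST - REMOVED_SECOND
pool = [f"syllable_{i:03d}" for i in range(remaining)]
selection = draw_syllables(pool, seed=1)
```

With the counts given in the text, four hundred sixty-five syllables remain in the pool before the draw.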

Each stimulus syllable was written in script with black crayon 
upon a white card four by six inches in area. Letters were between one 
and two inches in height to aid in visual discrimination and to facilitate 
manual-tracing. 

Recognition charts were devised containing fifty syllables per 
chart partitioned into five sections of ten syllables each. Nine sylla- 
bles of each section were formed by varying or substituting letters 
of the stimulus symbol. Thus if the stimulus syllable was “fer,” 
the variations might be ref, per, erf, fes, or other combinations. 
Each section contained a horizontal reversal of the stimulus, a sub- 
stitution of a letter of like configuration, the substitution of a total 
syllable of like configuration, and an auditory stimulus of similar 
pattern to the stimulus syllable. The tenth syllable in each section 
was the stimulus itself. 

1 Glaze, J. A.: “Association Value of Nonsense Syllables,” Jour. Genet. Psychol., 
Vol. xxxv, 1928, pp. 255-269. 

The position of each of the ten syllables comprising a section was 
determined by chance; the position of each section on the recognition 
chart was also determined by chance. Five sections, or fifty syllables, 
were always presented as a unit and called Chart I, Chart II, ... 
Chart XIV. Charts were drawn on stiff white paper and checked for 
legibility. 

The subjects were divided at random into two groups: Group A 
and Group B. Each subject learned the entire list of seventy sylla- 
bles; Group A utilizing a visual-audito-motor (VAM) method the first 
day, and a visual-audito method the second, alternating in this 
sequence until the entire list had been learned. Group B, in direct 
contrast, initiated the trials with a visual-audito procedure the first 
day and a visual-audito-motor method the second, continuing in 
that sequence. Membership in one group was not of greater impor- 
tance than that in the other since each subject learned the entire list, 
one day with the addition of manual-tracing and the next day without. 
The data of this experiment are based upon the activity of all subjects 
upon both methods of learning. 

Preliminary trials were found necessary and were given to each individual 
with the same technique and materials as were used in the true 
experimental sessions. The subject was left free to select his per- 
ceptive attack with both stimulus card and recognition chart. 

The subject, facing the experimenter, was seated at a small table 
so situated as to give good lighting. The following directions were 
given to the visual-audito-motor learners: 


I am going to show you a card with a word on it. I will say the word 
first, then you look at the word and trace it with your finger saying it exactly 
as I did. Be sure you say and trace the word together and finish both at the 
same time. Look at the word carefully! Don’t hurry! I will show you five 
cards one after another. Then I will ask you to find each of the five words 
you traced on a chart which has a lot more words on it. You will have to 
choose the word you traced from ten others which look almost like it, but 
only one is the same. I will keep showing you the cards until you can find all 
five words. Find them as fast as you can. Do you understand? [Repeat 
if necessary.] 


The visual-audito learners were given identical instructions with 
the exception that all reference to manual-tracing was eliminated. 
Both sets of directions were given to all subjects, the sequence of 
presentation depending upon the initial method of learning. 

The order of presentation of each of the five cards in the series was 
determined by chance. Identical progressions were few, precluding 
the possibility of the subject mastering the learning problem through 
primacy or position in the series. The cards were turned over in 
front of the subject, one at a time; the experimenter pronounced the 
word, then the subject either scrutinized and pronounced it, or 
scrutinized, pronounced, and traced it with his finger, according to 
his method of learning. After the exposure of the five cards, the 
recognition chart was presented. The subject selected the recognitive 
syllable he thought to be correct in each section; the section was 
then quickly covered. Following the blanketing of the section the 
subject was given knowledge of results: Right or wrong. This con- 
tinued until five choices had been made. This entire procedure was 
repeated until all five stimulus syllables were correctly recognized at 
that sitting. Ten syllables, or two series, were learned in one experi- 
mental period. Criterion of learning was one successful trial with a 
temporal presentation period of five seconds—timed with a stop-watch. 
Twenty-four hours later the subject was again presented with the 
respective recognition chart and was asked to select the stimulus 
symbols successfully identified the previous day. 

Trials to learn for each series, that is, the number of presentations 
required for recognition, and delayed recognition (24 hr.) were used 
as indices of learning. Immediate recognition (in the sense of a 
single immediate successful discrimination) was not used as an index 
of recognition. It was found that because of the shortness of the 
series learned in any single setting, initially correct recognitions were 
unreliable. This was substantiated in that many individuals gave a 
correct response on the initial trial but failed to repeat the response 
on the second, indicating that success was a chance occurrence. 

Experiment II differed little from the foregoing Experiment I in 
procedure with the exception of the necessity of varying method to 
adapt to new material.¹ 

Geometrical figures employed in this experiment are those used by 
Gates in his Reading Diagnosis Tests² and described further in his 
Improvement of Reading.³ These figures comprise Test VI, B: 
“Selection of Geometrical Figures,” of Gates’ “Tests of Visual Perception.” 
This test consists of forty-two plane geometrical figures 
(capable of being manually traced), and two hundred ten like figures, 
forty-two of which are similar to the stimulus figures; the subject 
selects the one figure from the group of five which appears identical 
with the stimulus figure. 

1 The similarity between Experiment I and II precludes the necessity of repeating 
methodological details already given. 

2 Gates, Arthur I.: Reading Diagnosis Tests, Bureau of Publications, Teachers 
College, Columbia University, New York, 1933. 

3 Gates, Arthur I.: Improvement of Reading, The Macmillan Company, New 
York, 1927, pp. 440. 


TABLE I. TOTAL DIFFERENCES PER SUBJECT BETWEEN METHODS OF LEARNING: 
TRIALS TO LEARN AND DELAYED RECOGNITION 
(Exp. I. Nons. Syll.) 

            Trials to learn              Delayed recognition 
Case    VAM¹    VA²    Diff.³          VAM     VA     Diff.⁴ 
                       (VAM-VA)                       (VAM-VA) 
A        35     43       -8             16     17       -1 
B        17     18       -1             27     31       -4 
C        29     22        7             23     21        2 
D        33     43      -10             23     22        1 
E        18     22       -4             20     21       -1 
F        24     34      -10             29     26        3 
G        25     19        6             28     25        3 
H        18     22       -4             19     23       -4 
I        32     36       -4             20     21       -1 
J        11     11        0             32     27        5 
K        45     50       -5             25     23        2 
L        20     24       -4             23     27       -4 
M        16     17       -1             24     22        2 
N        15     21       -6             18     25       -7 
O        16     14        2             22     22        0 
P        20     18        2             22     18        4 
Q        18     16        2             27     26        1 
Total                  (-38)                           (+1) 

1 Denotes visual-audito-motor learners. 

2 Denotes visual-audito learners. 

3 A + quantity indicates a greater number of trials to learn for the VAM learners 
and thus inferior achievement to the VA learners. Consequently a - quantity 
implies greater achievement for the VAM learners. 

4 A + quantity denotes a greater number of successful recognitions for the 
VAM learners and thus greater achievement than the VA learners. A minus score 
would indicate the direct opposite. 


Recognition charts were constructed by dividing the actual Gates’ 
test mentioned above into eight series of twenty-five figures each. 
Each series, that is, twenty-five figures, with stimulus figures deleted, 
was mounted upon white cardboard five by seven and one-half 
inches in area. 

1 Ibid., p. 396. 

Since only forty (eight series) out of forty-two stimulus figures 
were used, the remaining figures served as preliminary trials. 


RESULTS 


Table I indicates the total differences per subject between the methods 
of learning for “trials to learn” and “delayed recognition.” That is, 
the superiority or inferiority of the manual-tracing method for each 
individual is recorded and then summated. The obtained differences, 
ranging from -10 to +7 for “trials to learn” and from -7 to +5 for 
“delayed recognition,” are subtle and must be carefully interpreted. 
With “trials to learn” as a criterion of learning it can be seen that 
the more trials a subject requires to identify the symbols correctly, 
the less efficient is his learning. The greater the number of successful 
delayed recognitions, the more efficient the learning. 

Table I shows a surplus of negative “trials to learn” differences (-38) in 
the column headed VAM-VA, indicating fewer “trials to learn” for 
the visual-audito-motor (VAM) method, or an eight per cent improvement. 
No significant surplus for either method (+1) was seen for 
delayed recognition. 
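
The totals quoted above can be checked directly from the per-subject scores. The sketch below uses the “trials to learn” columns of Table I for cases A through Q, as far as they can be read; expressing the surplus as a percentage of the VA total is an assumption, since the text does not state the base it used.

```python
# "Trials to learn" per subject (cases A..Q), read from Table I.
vam = [35, 17, 29, 33, 18, 24, 25, 18, 32, 11, 45, 20, 16, 15, 16, 20, 18]
va = [43, 18, 22, 43, 22, 34, 19, 22, 36, 11, 50, 24, 17, 21, 14, 18, 16]

# VAM-VA per subject: a negative value means fewer trials with tracing.
diffs = [m - v for m, v in zip(vam, va)]
total_diff = sum(diffs)

# Assumed base for the percentage: total trials under the VA method.
percent_fewer = 100 * -total_diff / sum(va)

print(total_diff)               # summed difference across the seventeen cases
print(round(percent_fewer, 1))  # surplus as a percentage of the VA total
```

On these figures the summed difference is -38, which lies between eight and nine per cent of the 430 trials required under the VA method.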

A summary of the statistical treatment accorded the data of 
Experiment I is given in Table II. All figures in this table are based 
upon seven series, or thirty-five symbols (one-half the entire list). 
The grouping of the series in this manner serves to eliminate minor 
fluctuations which might enter the statistical treatment of few items. 


TABLE II. STATISTICAL SUMMARY 
Exp. I (Nons. Syll.)¹ 

                          Mean          SE mean         Range           SD         Diff.     SE      CR 
                                                                                  between   diff. 
                       VAM     VA      VAM    VA      VAM     VA      VAM    VA    means 
Trials to learn       23.06   25.29    2.10   2.7    11-45   11-50    8.69  11.15   2.24     1.22   1.84 
Delayed recognition   23.47   23.3      .97    .87   17-32   18-31    4.0    3.51    .17      .21    .81 

1 Scores for VAM and VA learners, given above, are each based upon seven series, or thirty-five 
words, since each subject learned one-half the list by one method and the remaining half by the other. 


The mean for the visual-audito-motor (VAM) learners, “trials to 
learn,” was 23.06 ± 2.10, with a standard deviation of 8.69 ± 1.48; 
for the visual-audito (VA) learners, 25.29 ± 2.7, with a standard deviation 
of 11.15 ± 1.92. The range for VAM was 11-45; for VA, 11-50. The 
actual obtained difference between means for “trials to learn” was 
2.24. The reliability of the difference between the means was found 
by the formula:¹ 

$$\sigma_{\text{diff}} = \sqrt{\sigma_{M_1}^{2} + \sigma_{M_2}^{2} - 2\,r\,\sigma_{M_1}\,\sigma_{M_2}}$$ 


The resulting standard error of the difference was 1.22. The diff./SE diff., 
or critical ratio, was 1.84, which according to Garrett² represents 
ninety-six (plus) chances in one hundred that the obtained difference 
between the means for the two methods is a true one. 
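
The summary statistics in this paragraph can be reproduced, to rounding, from the “trials to learn” scores of Table I. The sketch below is illustrative only. It assumes that the standard deviation was computed with N in the divisor and that the standard error of the mean is the SD divided by the square root of N (neither is stated in the text), and it reads the “chances in one hundred” as the one-sided area of the normal curve below the critical ratio, which is an assumed reading of Garrett's table. The function names are hypothetical.

```python
import math


def describe(scores):
    """Return (mean, SD with N in the divisor, SE of the mean)."""
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / n)
    return mean, sd, sd / math.sqrt(n)


def chances_in_100(critical_ratio):
    """One-sided normal-curve area below the critical ratio, times 100."""
    return 100 * 0.5 * (1 + math.erf(critical_ratio / math.sqrt(2)))


# "Trials to learn" per subject (cases A..Q), read from Table I.
vam = [35, 17, 29, 33, 18, 24, 25, 18, 32, 11, 45, 20, 16, 15, 16, 20, 18]
va = [43, 18, 22, 43, 22, 34, 19, 22, 36, 11, 50, 24, 17, 21, 14, 18, 16]

vam_mean, vam_sd, vam_se = describe(vam)
va_mean, va_sd, va_se = describe(va)

# Critical ratio from the reported difference and its standard error.
critical_ratio = 2.24 / 1.22

print(round(vam_mean, 2), round(vam_sd, 2))
print(round(va_mean, 2), round(va_sd, 2))
print(round(critical_ratio, 2), round(chances_in_100(critical_ratio), 1))
```

Under these assumptions the means and standard deviations agree with those reported (23.06 and 8.69; 25.29 and 11.15), and a critical ratio of about 1.84 corresponds to between ninety-six and ninety-seven chances in one hundred.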

Means and standard errors for twenty-four hour “delayed recognition” 
were: for the VAM learners, 23.47 ± .97; for the VA learners, 
23.30 ± .85. Standard deviations, in the order named, were 4.0 ± .71 
and 3.51 ± .60. The ranges for the two groups, VAM and VA, 
approximate each other; the former being 17-32, the latter 18-31. 
The obtained difference between means was .17. The standard error 
difference was .21. Critical ratio was found to be .81. 

The consistency of the recognition charts as measuring instruments 
was found by applying the Spearman-Brown Prophecy Formula.³ 
The correlation of the “whole,” or reliability, was .89 ± .06. 
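
For reference, the Spearman-Brown formula steps a correlation between two halves of a test up to the reliability of the whole test. The sketch below is a generic statement of that formula; the half-test correlation of .80 is purely illustrative, since the paper reports only the stepped-up value, and the function names are hypothetical.

```python
def spearman_brown(r_part, k=2.0):
    """Reliability of a test lengthened k times, given part reliability r_part."""
    return k * r_part / (1 + (k - 1) * r_part)


def half_from_whole(r_whole):
    """Inverse for k = 2: the half-test correlation implying a given whole."""
    return r_whole / (2 - r_whole)


illustrative_whole = spearman_brown(0.80)  # hypothetical half-test r of .80
implied_half = half_from_whole(0.89)       # half-test r implied by .89
```

A half-test correlation of about .80 would be stepped up to roughly .89, the reliability reported for the nonsense-syllable charts.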

The data of Experiment II were treated in the same manner as those 
of Experiment I. The tables of Experiment II occupy an analogous 
position to that of Experiment I and all qualifying conditions referring 
to the tables of the latter hold for the present. 

The interpretation of Table III is arrived at in the same manner 
as that of Table I. The superiority or inferiority of the manual-tracing 
method for each individual ranges from -7 to +3 for “trials to 
learn,” and from -6 to +4 for “delayed recognition.” The summation 
of the former, -18, is a ten per cent improvement of the 
visual-motor learners over the visual; and the summation of the latter, 
-5, is a two per cent decrease in efficiency as compared to the visual 
method. 

Table IV summarizes the statistical treatment of the data of 
Experiment II. Means and standard errors, “trials to learn,” for 
the visual-motor (VM) and visual (V) learners were, respectively, 
9.7 ± .65 and 10.76 ± .92. Standard deviations, and standard errors 
of the standard deviations, were: VM, 2.68 ± .46; V, 3.8 ± .65. 
Some difference was evident in the ranges of the two groups: VM, 6-16; 
V, 5-21. The obtained difference between means was 1.06. The standard 
error of the difference was .39; the critical ratio 2.72. These figures 
represent data from four series, or twenty geometrical figures, for each 
method of learning. 

1 Guilford, J. P.: Psychometric Methods. McGraw-Hill Book Co., Inc., New 
York, 1936, p. 59. 

2 Garrett, Henry E.: Statistics in Psychology and Education. Longmans, Green 
and Company, New York, 1933, Table XIV, p. 134. 

3 Ibid., p. 271. 


TABLE III. TOTAL DIFFERENCES PER SUBJECT BETWEEN METHODS OF LEARNING: 
TRIALS TO LEARN AND DELAYED RECOGNITION 
(Exp. II. Geom. Fig.) 

            Trials to learn              Delayed recognition 
Case    VM¹     V²     Diff.³          VM      V      Diff.⁴ 
                       (VM-V)                         (VM-V) 
I        12     13       -1             13     12        1 
H         9     11       -2             18     15        3 
B         8     10       -2             17     17        0 
G         7      7        0             19     16        3 
A        14     21       -7             13     17       -4 
C        12      9        3             20     17        3 
F        10      9        1             18     17        1 
J         6      8       -2             15     18       -3 
Q         6      5        1             18     20       -2 
E        10      8        2             17     15        2 
R         9      7        2             16     18       -2 
S         8     12       -4             19     19        0 
T        11     12       -1             11     17       -6 
V        11     12       -1             14     19       -5 
X        16     17       -1             17     14        3 
U         7     10       -3             19     18        1 
Y         9     12       -3             17     17        0 
Total                  (-18)                           (-5) 

1 Denotes visual-motor learners. 

2 Denotes visual learners. 

3 Cf. footnote 3, Table I. 

4 Ibid., footnote 4. 


Means and standard errors of the means for the two groups of 
learners, “delayed recognition,” were: VM, 16.46 ± .58; V, 16.88 ± .48. 
Standard deviations and standard errors of these deviations were, 
respectively, 2.4 ± .41 and 1.99 ± .382. The ranges for the two groups 
approximate each other: VM, 13-20; V, 12-20. The obtained difference 
between means was .42; the standard error of the difference .27; the critical 
ratio 1.55. 

The reliability of the recognition charts, by the Spearman-Brown 
Prophecy Formula, was .77 ± .10. 

TABLE IV. STATISTICAL SUMMARY 
Exp. II (Geom. Fig.)¹ 

                          Mean          SE mean         Range           SD         Diff.     SE      CR 
                                                                                  between   diff. 
                        VM      V      VM     V       VM      V       VM     V     means 
Trials to learn        9.7    10.76    .65    .92     6-16    5-21    2.68   3.8    1.06     .39    2.72 
Delayed recognition   16.46   16.88    .58    .48    13-20   12-20    2.4    1.99    .42     .27    1.55 

1 Scores for VM and V learners, given above, are each based upon twenty geometrical figures 
since each subject learned one-half the list by one method and the remaining half by the other. 


SUMMARY OF RESULTS 


Experiment I.—(1) The visual-audito-motor (VAM) learners were 
superior to the visual-audito (VA) in “trials to learn,” since an excess 
of thirty-eight trials to learn was required by the latter group on a 
list of thirty-five nonsense syllables. A difference between means of 
2.24, in favor of the VAM learners, was not statistically significant 
since the critical ratio was only 1.84. 

(2) Practically no difference between the two methods of learning 
was found for twenty-four hour delayed recognition (retention). The 
excess of one additional correct recognition for the visual-audito- 
motor learners was negligible; the difference between means of .17 
was also insignificant. 

(3) Reliability of the test (recognition charts) was .89 ± .06 by 
the Spearman-Brown Prophecy Formula. 

(4) Summation of the differences of individual results on each 
method shows the visual-audito-motor (VAM) learners to be superior 
in trials to learn by eight per cent; no superiority was seen for delayed 
recognition. 

Experiment II.—(1) Superiority was again manifested for the motor 
element since the visual-motor (VM) learners were superior when 
geometrical figures were used. Eighteen fewer total trials to learn 
were recorded for seventeen subjects on a list of twenty geometrical 
figures. The difference between means was of the order of 1.06 in 
favor of the above named group. Reliability of the manifested superi- 
ority of the group was high since the critical ratio, 2.72, approached 
maximum reliability. 

(2) No striking differences appear in twenty-four hour delayed 
recognition (retention) for the two methods of learning. The critical 
ratio, 1.55, was not statistically significant. 

(3) Reliability of the test (recognition charts) was found to be 
.77 ± .10 by the Spearman-Brown Prophecy Formula. 

(4) Summation of the differences of individual results on each 
method shows the visual-motor (VM) learners to be superior by ten 
per cent for trials to learn and inferior by two per cent for delayed 
recognition. 


DISCUSSION OF RESULTS AND CONCLUSIONS 


The preceding sections, at first glance, seem to indicate that the 
manual-tracing method (visual-audito-motor and visual-motor 
learners) was superior in “trials to learn” (acquisition) for both 
nonsense syllables and geometrical figures. No superiority was seen 
for retention (delayed recognition). However, the most serious consideration 
involved in the interpretation of this study is the number 
of cases handled. All obtained differences between means for the two 
methods, with one exception, were not statistically significant. But 
what does this mean? It means theoretically that the probable 
amount of error due to sampling is greater than the obtained differences 
between the means. The point is that actual significant differences 
were obtained for “trials to learn” in both experiments, but were 
deemed not highly significant by the application of quantitative 
formulae which apply only to large numbers of cases and not to a 
selected qualitative group such as this study represents. Garrett believes 
... “the significance of a measure of reliability is conditioned upon 
our having a sufficiently large number of cases. If N [the number of 
cases] is less than twenty-five, there is little sense or justification in 
using reliability measures.”¹ To use no measures of reliability would 
be as unwise as placing too much stress upon them. But the process 
of finding critical ratios for the data of each of his six subjects, as 
Kirk² has done, and placing absolute finality on these measures of 
reliability is unwise and an overuse of statistical measures. The same 
would hold for seventeen cases. It must be remembered, in addition, 
that although the mean differences may appear small here, where the 
lists of symbols are not large, under actual reading conditions, where 
learning is more extensive, they may be quite important. Consistent 
with this philosophy it seems advisable, until further evidence is 
available, to stress the actual difference magnitudes which appear 
for each individual, and the superiority of the means of the total 
activity of the subjects. 

1 Garrett, Henry E.: Op. cit., p. 142. 

2 Kirk, Samuel A.: Op. cit. 

To recapitulate: the data in this paper seem to indicate that 
greater economy in the acquisition of nonsense syllables and geo- 
metrical figures was had for our partial reading disability cases when a 
manual-tracing technique was used. Retention did not seem to be 
aided by this tracing factor. 

The inclusion of the kinaesthetic-tactual, or manual-tracing pro- 
cedure, in a program of remedial reading for partial reading disability 
cases, can often be beneficial. It is a commonplace that this factor is 
widely included in most successful methods of treating reading dis- 
ability without the recognition of its true value. In light of this it 
would seem that a purely visual remedial technique as reported by 
Traxler,¹ although successful, might very well have been more so with 
the addition of the kinaesthetic-tactual factor. This is not to be 
construed to mean that every disability case is to be subjected to 
manual-tracing. Certain factors in the constellation may make this 
unnecessary, or even to be avoided. Among these may be the subject’s 
emotional attitude toward tracing, and the ability to learn more aptly 
through other means. Manual-tracing used in an incorrect manner 
may act as a distraction and interfere with learning. By all means, 
the pupil should be subjected to this kinaesthetic-tactual factor to 
determine its effect on the recognitive process. We can see that 
Witty and Kopel have misinterpreted when they criticise the Fernald- 
Keller technique in the following manner: ‘‘A pertinent question is 
why children should be exposed to such a method (e.g. the Fernald- 
Keller kinaesthetic technique) which stresses predominantly one 
sensory inlet, and avoids the use of those channels which, it is assumed, 
are relatively deficient. The writers believe that a defensible teaching 
method would utilize every sensory channel as each makes a natural 
contribution to the learning process.”² The point is that the kinaesthetic 
channel, which is usually neglected, is added to other sensory 
channels and all operate in the learning process thus facilitating 
learning. Non-readers have learned to read because of this addition. 
When the subject learns without tracing, it is discontinued. The 
question of a deficiency in a visual or other sense mode does not 
concern us. 





1 Traxler, A. E.: “An Experiment in Teaching Corrective Reading to Eight 
Seventh-Grade Pupils.” Jour. Educ. Res., Vol. xxix, 1935, pp. 247-253. 

2 Witty, Paul, and Kopel, David: “Causation and Diagnosis in Reading Disability.” 
Jour. of Educ. Psychol., Vol. ii, 1936, pp. 169-170. 


THE DREAMS AND WISHES OF ELEMENTARY- 
SCHOOL CHILDREN 


PAUL WITTY AND DAVID KOPEL 


Northwestern University 


This study reflects maturation levels in several aspects of the 
emotional development of three thousand four hundred elementary- 
school children. Under the direction of the writers, teachers con- 
ducted individual interviews with the children in all grades of an entire 
suburban school system, to obtain systematic information concerning 
children’s experiential backgrounds, preferred interests and activities, 
social relationships, and other behavior manifestations. The teachers 
were guided in this work by the informal use of the Witty-Kopel 
Interest Inventory.¹ 

Some of the information derived from this investigation has been 
reported in a previous paper.² This report is limited to data revealing 
characteristic wishes and dreams of Evanston public-school boys and 
girls in age categories from five to fourteen years. 


CHILDREN’S WISHES 


One of the questions asked of children during the administration of 
the Interest Inventory was: ‘‘Suppose you could have three wishes 
which might come true, what would be your first wish? Second wish? 
Third wish?” The responses were classified empirically according to 
the ostensible desires which were expressed. Analysis (for this report) 
of the responses was limited to wishes declared by samples of fifty 
boys and fifty girls from the kindergarten and each of the eight 
grades—a total of nine hundred children selected at random from a 
school population of three thousand four hundred. The boys expressed 
eight hundred forty wishes, and the girls eight hundred thirty-three 
wishes—an average of nearly two wishes for each child in the group. 

The preponderance of children’s wishes in practically every grade 
is for recreational equipment of various kinds (tools, books, toys, pets, 
etc.). For boys, wealth, travel, and proficiency in a skilled profession 
rank next in the order named. Girls place travel above wealth and 
follow with proficiency in the arts. These wishes maintain high ranks 
for both sexes throughout the grades. Frequently expressed by the 
girls in the primary grades are wishes for certain kinds of clothing. 
Girls in many grades wish for fame, leadership, or high social position. 
Wishes for success or proficiency in school are mentioned infrequently 
by boys and girls and only in the upper grades. Other wishes (relatively 
few in number) relate to desires for further schooling, health, 
strength, power, entertainment, friendship, freedom from present 
responsibilities, security, and personal or parental happiness. 

1 The Interest Inventory and Manual of Directions may be obtained from the 
Northwestern University Psycho-Educational Clinic, Evanston, Illinois. 

2 Witty, Paul and Kopel, David: “Studies of the Activities and Preferences of 
School Children.” Educational Administration and Supervision, Vol. xxiv, 
September, 1938, pp. 429-441. 

The Journal of Educational Psychology 

These results may be compared with the wishes reported by Jersild, 
Markey, and Jersild¹ in a study of four hundred public- and private-school
children in New York. Although the categories employed in
the two investigations for classifying wishes differ somewhat, the 
present study corroborates many items in the other study. In both 
studies the preponderant expression of wishes is for “material objects
and possessions.” In this category were included thirty-seven per
cent of the wishes of the Evanston children, thirty-two per cent of 
the New York private-school children, and fifty-two per cent of the 
New York public-school children. 

The disparities in these percentages are easily explained. Wishes 
frequently are reflections of desires to possess objects associated with 
affluence or high economic status. For example, modern technology 
has created certain products which ostensibly reveal this status; 
advertising reaches all income groups and creates almost universal 
demands for some of these products. Obviously the lower income 
groups have a greater number of unsatisfied desires for material 
things. This fact appears to account for the disparate percentages 
noted by Jersild in the material wishes of public- and of private-school 
children. Since the homes of Evanston children are of a relatively 
privileged socio-economic status, the wishes of these youngsters 
approximate more closely those of private- than of public-school 
children. It is interesting to note that similar differences between 
these groups (Evanston and New York) exist in the relative incidence 
of dreams about “toys, food, money, etc.”

A rather high degree of consistency is displayed in the types of 
wishes expressed by children throughout the elementary-school age 
range. Moreover, the wishes of girls and boys correspond rather 





1 Jersild, Arthur T., Markey, Frances V., and Jersild, Catherine L.: Children’s 
Fears, Dreams, Wishes, Daydreams, Likes, Dislikes, Pleasant and Unpleasant 
Memories. Child Development Monographs, No. 12. New York: Teachers 
College, Columbia University, Bureau of Publications, 1933. 
closely. The greater maturity of older children is revealed, however,
as Jersild notes, in the somewhat more inclusive and general nature of
wishes as contrasted with wishes for specific objects. The character of
the wishes illustrates “the empirical nature of children’s concepts.
The children’s thoughts [at all ages] are directed toward accomplished
objective facts [and toward the acquisition of objects] rather than
toward the possession of powers within themselves which would enable
them to win the things they desire.”

Thirty-nine per cent of the boys in the present study and forty-four 
per cent of the girls report having told their wishes to someone. 
About forty per cent of the entire group report that their wishes have 
come true.¹ The tendency to communicate wishes to others increases
steadily throughout the grades (especially for girls). In all grades 
wishes are told to the parents, especially to the mother, by almost 
two-thirds of the boys and girls. Friends comprise twenty per cent 
of the communicants for both groups, and siblings and close relatives 
form the small remainder. Teachers rarely achieve the intimacy 
and rapport with children necessary to become their confidants in 
this respect. 

Perhaps more significant than the group analyses and averages of 
the frequency and types of children’s wishes (for individual guidance) 
is the rather frequent indication in individual expressions of wishes 
that emotional conflicts, anxieties, and fears are basic in producing 
poor orientation in and out of school. Thus, one youngster vouchsafed 
promptly his first wish: That his twin brother would cease beating 
him all the time! Investigation revealed the fear-laden existence of 
this child who was enduring continual physical persecution by his 
heavier and stronger twin. This condition was an important factor, 
of course, in producing maladjustment. Knowledge of the sibling 
relationship was instrumental in alleviating the situation and in 
effecting a condition wherein the formerly abused child was treated 
humanely by his brother; thereafter the school adjustment improved 
noticeably. Another (not atypical) child wished that his father
“had a job” and that his mother would “get well.” This information
helped the teacher to understand the boy’s basic need for sympathy
and security, and to see the “remedial reading” problem which he
was said to present as subordinate to many other serious obstacles to 
wholesome development. The school, which is frequently impotent
in alleviating directly many undesirable social and economic conditions,
should make special efforts to maintain the morale of underprivileged
children by giving them a measure of sensed and appropriate
achievement.

1 No attempt has been made to analyze the nature of the wishes which did and
those which did not eventuate.


CHILDREN’S DREAMS 


Information concerning the frequency of dreams was obtained 
from nearly all the children in the seven elementary schools of Evans- 
ton, Illinois (District 75). One thousand seven hundred fifty-seven 
boys and one thousand six hundred forty-seven girls, rather evenly 
distributed among the eight grades and kindergarten, were included in 
the study. Of the entire group, sixteen per cent reported that they
“never” dream, sixty-three per cent dream “sometimes,” and twenty-one
per cent dream “often.” Twenty-one per cent of the boys and
the same percentage of the girls dream “often.” Seventeen per cent
of the boys and fourteen per cent of the girls “never” dream. Somewhat
more girls than boys—sixty-four per cent as compared with sixty
per cent—dream “sometimes.” The tendency for girls in slightly
larger numbers than boys to report the (occasional, as well as frequent) 
occurrence of dreams is evident throughout the grades; this tendency 
appears also in Jersild’s report. A grade distribution indicates that 
the reported incidence of dreams among boys and girls decreases 
somewhat as they grow older. Dream manifestations, however, are 
common to most children in the elementary school. 

Investigation of the nature of children’s dreams was limited to a 
study of the items reported by a sample of one hundred children— 
fifty boys and fifty girls—selected at random from the membership in 
each of Grades I through VII of the various schools. Thus seven 
hundred children reported five hundred fifty-nine dreams; these were 
grouped in twenty-seven categories according to the empirical classification
used in the study by Jersild, Markey, and Jersild cited above¹
(see Table I). 
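
The percentages in Table I appear to be each category's share of the five hundred fifty-nine classified dreams. The Python sketch below assumes that reading of the table and recomputes three of the largest entries from their printed frequencies; the names used are illustrative and are not the authors'.

```python
classified_dreams = 559  # dreams placed in the twenty-seven categories

# Frequencies as printed in the "boys and girls" column of Table I.
frequencies = {
    "Everyday events, objects, persons": 103,
    "Travel, diversions, adventure, amusements": 65,
    "Being chased, threatened": 54,
}

# Share of all classified dreams, rounded to one decimal place.
percentages = {
    category: round(100 * count / classified_dreams, 1)
    for category, count in frequencies.items()
}

for category, share in percentages.items():
    print(f"{category}: {share} per cent")
```

Under this assumption the three shares come to 18.4, 11.6, and 9.7 per cent, matching the figures given in the table and in the text.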

The diversity of the children’s dreams is one of their characteristics. 
Most frequently included in a single group (eighteen and four-tenths 





1 In order to make the percentages reported in the two studies comparable,
it was necessary to eliminate in the present analysis the many responses which
could not be classified in any of the twenty-seven dream categories. This condition
occurred because the Evanston children were not prompted and urged, as
were the New York children, to explain and amplify their first responses. Hence,
ten per cent of all responses were “unintelligible,” twelve and three-tenths per cent
referred to “no dreams,” and nine per cent expressed “inability to remember the
dream.” In the Jersild study these groups comprised only four and one-tenth
per cent of the total responses—a proportion too small to necessitate a revision of
his original figures.
TABLE I.—DREAMS OF ELEMENTARY-SCHOOL CHILDREN

Categories of dreams | Evanston boys, per cent | Evanston girls, per cent | Evanston boys and girls, frequency | Evanston boys and girls, per cent | New York children, per cent
 1. Possession of toys, food, money, etc. | 4.5 | [illegible] | 30 | 5.4 | 7.4
 2. Flying and levitation | 1.4 | [illegible] | 7 | [illegible] | [illegible]
 3. Travel, diversions, adventure, amusements, play | 14.8 | 8.8 | 65 | 11.6 | 9.5
 4. Everyday events, objects, persons | 20.5 | 16.6 | 103 | 18.4 | 9.5
 5. Association with relatives and friends | 1.9 | 6.1 | 25 | 4.1 | 3.6
 6. Marriage and parenthood | .4 | .7 | 3 | .5 | 1.4
 7. Beneficent elves, fairies, and magic happenings | 2.3 | 6.[illegible] | 26 | 4.7 | 3.0
 8. Prestige, achievement, independence | 3.7 | 5.1 | 25 | 4.4 | 4.7
 9. Benefits for relatives, altruistic activities | .... | .... | .... | .... | 1.3
10. Movies and stories (not fear-inspiring) | 9.5 | 8.5 | 50 | 8.9 | 2.8
11. Sensory forms, colors and designs | .4 | .3 | 2 | .4 | .3
12. Poverty, loss, breakage of property | .8 | .7 | 4 | .7 | 1.4
13. Embarrassing reprimands and guilty behavior | [illegible] | [illegible] | 1 | .2 | [illegible]
14. Being powerless, losing flesh | 2.3 | 1.0 | 9 | 1.6 | 2.6
15. Movies, stories of mystery, murder, etc. | 3.8 | .7 | 12 | 2.2 | 5.7
16. Apparitions, terrifying sights, nightmares | 3.0 | .7 | 10 | 1.8 | 4.1
17. Strange people and places, the dark, etc. | 2.3 | 7.1 | 27 | 4.9 | 7.3
18. Noises | .... | .... | .... | .... | 1.3
19. Successful fights, escapes, riddance of unpleasantness | .8 | 1.4 | 6 | 1.1 | 2.7
20. Falling, being in high places | 4.9 | 3.0 | 22 | 3.9 | 1.7
21. Collisions, diving | .4 | [illegible] | 2 | .4 | 2.2
22. Loss, death, sickness of parents or other relatives | .8 | .7 | 4 | .7 | 3.1
23. Other misfortunes befalling relatives and others | .4 | 2.0 | 7 | 1.3 | .2
24. Accidents, injuries, punishment, fighting, etc. | 6.8 | 4.4 | 31 | 5.6 | 8.7
25. Being chased, threatened | 7.2 | 11.9 | 54 | 9.7 | 6.4
26. Ghosts, bogeys | 4.9 | 3.0 | 22 | 3.9 | 2.2
27. Fires, storms, catastrophes | 3.4 | 2.7 | 17 | 3.0 | 2.9
28. Unintelligible | | | 82 | |
29. No dreams | | | 100 | |
30. Can’t remember | | | 73 | |

per cent of all dreams) were dreams of “every-day events, objects, and
persons.” Of course, the influence of children’s recent diurnal behavior
is evident in many dreams classified in other categories. Categories
containing more than four per cent of the children’s dreams follow in 
frequency order: 


                                                              Per Cent

Travel, adventure, amusement, play........................... 11.6
Being chased or threatened................................... 9.7
Movies and stories (not fear-inspiring)...................... 8.9
Accidents, injuries, punishment, fighting, etc............... 5.6
Possession of toys, food, money, etc......................... 5.4
Strange people and places, the dark, etc..................... 4.9
Beneficent fairies and happenings............................ 4.7
Individual prestige, achievement, or independence............ 4.4
Association with relatives or friends........................ 4.1

The two groups of dreams mentioned first in this list are accorded
first rank in the Jersild study. Indeed, seven of the ten groups mentioned
above are ranked by Jersild as the seven most frequently mentioned
classes of dreams. In most categories the percentages reported
in the two investigations are similar, although in general the percentages
are somewhat higher in Jersild’s study than in this investigation.
This difference is particularly noticeable in several categories disclosing
unpleasant dreams; e.g., poverty, loss, breakage of property; embarrassing
reprimands and guilty behavior; being powerless, losing flesh;
movies, stories of mystery, murder, etc.; apparitions, terrifying sights,
nightmares; strange people and places, the dark, etc.; successful
fights, escapes, riddance of unpleasantness; collisions, diving; loss,
death, sickness of parents or other relatives; and accidents, injuries,
punishment and fighting.

Several interesting disparities appeared in the two studies. Thus
the Evanston children report no dreams revealing “benefits for
relatives or other [similar] altruistic activities.” Of the New York
children, one and three-tenths per cent mentioned dreams included in
this category. Jersild reports a similar percentage (1.3) of dreams
relating to “noises.” None of the Evanston children mentioned this
item. Evanston undeniably is a very quiet city; it would be unwarranted
to suggest, however, that it provides for its children no experiences
in philanthropy or altruism.

Among the classes of dreams reported in greater frequency by
Evanston than by New York children are the following: Everyday
objects, events, and persons; beneficent fairies and magic happenings;
movies and stories (not fear-inspiring); falling from or being in high
places; minor misfortunes befalling relatives and others; being chased
or threatened; and ghosts or bogeys.

As in the Jersild study, minor differences between the dreams of 
older and younger children appear, but, “on the whole, the general
similarity of dreams reported by young and old is more noteworthy
than the difference.”

The two studies coincide also in revealing the essential similarity 
in the dreams of boys and of girls of these ages. Evanston boys, like 
the New York boys, dream more often than the girls about bodily 
injury and accidents and about falling. In addition, the Evanston 
boys report more frequently than girls terrifying dreams about ghosts 
and bogeys, apparitions, mysteries and murders, and “powerless
feelings.” The girls report more dreams about beneficent elves and
magic happenings, association with friends and relatives, strange 
people and places, minor misfortunes to relatives and others, and 
“being chased or threatened.” 

Concerning the causes and functions of dreams, a vast literature, 
largely conjectural in nature, has been written. The data reported in 
this paper, like Jersild’s, “do not seem to substantiate the theory that
dreams are primarily a form of wish fulfillment.” We agree with
Jersild that: “The content of dreams is likely to reflect any experience,
wish, fear, fancy, or circumstance which occurs in the child’s
waking moments, and dreams are likely to reflect unpleasant events
in the child’s experience somewhat more frequently than pleasant
events.” Certainly, some children’s dreams support Freud’s belief
that they are wish fulfillments. Others appear to be, as Rivers¹
suggests, attempted resolutions of persistent problems and personal 
conflicts. Still others doubtless arise from and are continuations of 
recent, vivid sensory and stimulating ideational experiences. There 
appears to be no need for utilizing psychoanalytic concepts in inter- 
preting many, perhaps most, of these dreams. On the other hand, in 
some children the manifest dream and its ostensible meaning are 
doubtless subtle distortions of unacknowledged or unconscious motives 
and drives, to be understood only through rather thorough study of 
the child’s developmental history. In connection with the study of a 
child, information concerning dreams is one item among many which 
contributes to a complete understanding of the springs or drives to 
human action. Its significance and its place in child study, it appears, 
are frequently overemphasized. 





1 Rivers, W. H. R.: Conflict and Dream. New York: Harcourt, Brace and Co., 
1923. 








THE CAUSES OF SLOW READING: AN ANALYSIS* 
E. DONALD SISSON 


Louisiana State University


The central processes, as the fundamental bases for reading, 
have been given a steadily increasing importance by recent authors. 
The arguments, in the main, have come from the eye-movement 
workers for whom it represents a swing away from the emphasis on 
peripheral factors. Before Javal’s pioneer studies on eye-movements, 
educators seemed to be aware of the fact that it was the individual who 
did the reading. But beginning with Dearborn’s monograph on the
eye-movements of readers, some investigators appeared to be of the 
belief that all reading deficiency lay in eye-movement deficiency. 
The dichotomy of central versus peripheral factors became more 
acute, and reading investigators took sides. There were, on the one 
hand, those who persisted in stressing the importance of the individual,
among them Whipple, Judd, Gates,† Vernon,
Tinker, Sisson, and Anderson. The motto of this faction
would appear to be the statement, frequently found in their writings,
that “eye-movements are symptoms rather than causes of underlying
central deficiencies.”

The opposed group stems from Dearborn and includes such
investigators as Gray, Pollock and Pressey, Pressey, Ring and
Bentley, Danner, and Robinson. The common characteristics
in the publications of these men are the emphasis on the peripheral 
factors in reading, mainly eye-movements, and the insistence that 
remedial procedures should begin with the oculo-motor habits of the 
reader. For purposes of convenience we will consider these men as 
comprising the peripheral school. Closely allied with them are Betts,
Eames, and Taylor, who attach great significance to the
visual factors in the etiology of reading deficiencies. The tenets 
of these men we shall consider later. 

This clear-cut division of opinion leads to the impression that 
the process of reading is analyzable into two distinct aspects—the 
central and the peripheral—and that it is the purpose of reading 





* The analysis in this paper is restricted to slow reading of individuals past the 
beginning stage. It, therefore, omits such factors in the etiology of so-called 
non-reading cases as strephosymbolia, hand-and-eye dominance, etc. 

† In the revision of his book on The Improvement of Reading, Gates seems to
have changed sides. 
investigations to derive the equation between the two. It is clear
that if the issue is to be settled, some analysis of the reading process 
more adequate than has yet been attempted is necessary. The 
terms central and peripheral are vague and susceptible to many 
interpretations. They can be used to explain the unknown only in 
terms of the less known. And they raise the disturbing ghost of 
mind-body dualism. 

Reading is usually discussed under the topics of rate and comprehension.
This distinction is based on the peripheral-central
dichotomy, rate supposedly being a function of the oculo-motor 
apparatus and comprehension the business of the central nervous 
system. Such analysis is arbitrary. As Anderson points out, rate
is the temporal dimension of the psychological processes concerned 
with the derivation of meaning from the printed page. Or, in the 
usual terms, rate is the temporal dimension of comprehension. The 
two do not exist as separate processes, but are two aspects of the same 
process; namely, reading. And so they vary concomitantly. A 
change in rate is accompanied by a change in comprehension, other 
things being equal; and any variation in the demands put upon the 
comprehending process is reflected in the rate. 

And is comprehension a unitary factor? One aspect certainly is 
intelligence. The usual methods of comprehension testing are identi- 
cal with those of intelligence testing, e.g., vocabulary knowledge, 
understanding directions, and grasping paragraph meaning. There is
no justification in multiplying concepts beyond the number of opera- 
tions employed. Reading comprehension, then, is partly a function 
of the intelligence of the reader, and we should expect readers of equal 
intelligence, reading at the same rate, to be equal in comprehension. 
Conversely, readers of equal intelligence, reading with the same degree 
of comprehension, would be expected to do it in the same time. 

Unfortunately, this is an over-simplification of the situation, 
since there exists a second aspect of comprehension that is not neces- 
sarily related to the intelligence aspect. This we have called the 
quality-quantity factor, or the factor of the kind and amount of 
material derived from the reading. It is a function, mainly, of the 
instructions given and of the set and interest of the reader. Judd and 
Buswell, Gray, and Anderson have shown that the reading performance
varies with the reader’s attitude and aufgabe. If the goal
is to reproduce the material verbatim, the reading is more irregular
and requires a longer time, for the same individual, than if the goal
is to read for general idea, or to find the answers to specific questions
(cf. Frandsen). Vernon has shown that the reading performance
varies with the reader’s interests and “conative impulses.” And,
recently, Pickford has pointed out the influence of the reader’s
attitude and ideas and the author’s style, all of which are agents that 
appear to mediate between the reader’s ideas and the writer’s inten- 
tion. The influence of the reader’s attitude is shown in these state- 
ments from the earlier article: 


An attitude is an orientation, or mental set, which determines the specific 
line of interpretation in many ways. . . . An attitude may make way for the 
elaboration of subjective material, and therefore of meaning; . . . Attitudes 
may be over-determinate, and then interpretation of reading matter is 
hindered if attitudes are inappropriate, though a highly determinate attitude 
is helpful if well adjusted. 


The kind and amount of material derived and assimilated by the 
reader is thus based on intelligence and on a group of subjective 
factors such as attitudes, interests, appreciations, and preconceived 
ideas of what to look for and what to expect. There is no reason to 
suppose that these should be interdependent. In fact, Dewey has
shown that there is no high correlation between the acquisition of
facts from the material read and the “ability to evaluate, to read
between the lines, and to understand the significance of what is
read.”

We might expect, then, that the quality-quantity factor will vary 
from reader to reader and from time to time, even though intelligence 
be held constant. Furthermore, it is probable that a substantial 
correlation will obtain between this factor and rate of reading when 
intelligence is controlled. 

But there are still other elements in the reading process which 
may be responsible for individual differences. One of the more 
important of these is variously entitled visual apprehension, word 
recognition, or, by Gates, “word perception.” Whether these
different terms apply to the same phenomenon or not is not clear, but 
it is evident that most investigators believe in the existence of a trait 
having to do with the meaningful apprehension of the written lan- 
guage. That an affinity exists between visual apprehension and 
reading ability has been remarked by Gray, Hoffmann, Gates,
Litterer, and Swanson. Visual apprehension span correlates to a
considerable degree with intelligence (Dallenbach, Hoffmann, and




Litterer). But that intelligence and visual apprehension are not
completely congruous is indicated by the results of Anderson and
Fairbanks that one “attribute of poor readers is inability to recognize
words visually, although the reader may understand these in hearing
them.” Inability to recognize material visually presented reaches a
peak in the condition known as “word blindness” (Hinshelwood).

The question as to whether or not visual perceptual span is amenable
to practice has been attacked by several workers (Dallenbach;
Whipple; Gundlach, Rothschild and Young). The results would
seem to favor some improvement, with habituation and grouping 
techniques largely responsible, but a limit of improvement appears to 
be rather quickly reached after a rapid initial rise. 

In recent years there has been much interest in other aspects in the 
etiology of poor reading. Betts, Eames, Taylor, Uhl, Wagner,
and many others have emphasized the importance of such visual 
factors as muscle imbalance, restricted visual fields, fusion conver- 
gence, stereopsis, etc., as well as the usual visual acuity and refractive 
errors. Tests for many of these factors are incorporated in the 
Betts Ready to Read Tests battery which makes use of the ophthalmic 
telebinocular, a device which is finding increasing use as a diagnostic 
tool in remedial reading programs. In view of other evidence, it is 
probable that the value of these visual factors has been exaggerated 
as reasons for individual differences in reading ability. According to 
Swanson and Tiffin, the Betts tests of visual sensation and perception
yield no significant differentiation between good and poor readers
among college freshmen. They conclude that “sensory deficiencies
are relatively unimportant as compared with the central and more
generalized habits and capacities.” From an analysis of the same
test battery, Witty concludes that poor readers are not characterized
by a higher percentage of visual deficiencies and anomalies than good
readers. This does not mean, of course, that visual defects are
entirely unrelated to reading performance. It is pointed out, however, 
that visual defects may be hindrances to both good and poor readers. 

Other features of less consequence in reading ability are ocular 
reaction time and eye-movement time. The former is relatively 
unimportant owing to its extreme brevity (Dodge—165-175 sigma),
and the latter, because it accounts for only about 5.3 per cent to 9.6
per cent of the total reading time (Tinker).


Our analysis yields the following factors as significant determiners 
of individual differences in reading ability: 
(a) Intelligence, or the ability to comprehend the written word.

(b) Perceptual span or word perception.

(c) Quality-quantity factor, or that set of attitudinal and conative deter- 
miners of the kind and amount of product or by-product of the reading. 

(d) Peripheral visual factors. 

(e) Ocular reaction time and eye-movement time. 


We have omitted oculo-motor patterns, e.g., pause duration, 
regression frequency, fixation frequency, etc. as etiological factors, 
since they are merely the behavioral aspects of the above determiners. 
Furthermore, as has been pointed out, some of these are of greater 
importance than others. It is, of course, highly desirable that all 
visual defects, in good and poor readers, be detected early and cor- 
rected insofar as possible. But with all necessary corrections made, 
we shall still find large individual differences in reading ability. The 
last factors are unimportant for the reasons already given. 

We are left, then, with intelligence, perceptual span, and the 
quality-quantity factor. Of these, the first, by definition, cannot 
be markedly improved. The second is dependent on intelligence to 
some extent (Dallenbach, Hoffmann, and Litterer), but is amenable
to practice, at least to a limited extent, as has been stated above. 
Exercises calculated to improve word recognition and perception, 
then, might be expected to improve reading ability, within limits. 
This, of course, has been verified on many occasions. 

The third and most vital of these factors has been largely over- 
looked by reading investigators. It is based entirely on habit, and 
so is educable. Its basis is the habitual reading attitude. The slow 
reader, all other things being equal, is the careful, detailed reader. 
He is conscious of each word, each punctuation mark, and each 
typographical error. Or again, the slow reader may peruse the passage
too passively, allowing any and all extraneous associations to intrude 
upon his thought, and lingering over passages that seem more interest- 
ing. Another person, on the other hand, may habitually skim the 
material, looking for main ideas, and actively forcing the reading 
through rigorous concentration. The slow, detailed reader may get 
more out of his reading, but is it worth while? The answer to such 
a question depends, of course, on the material or what should be 
extracted from it. And can there be any doubt that, in most cases, a
rapid rate of comprehension is beneficial, if not essential? The quality
and quantity of the slow reader’s derived product is different from that 
of his faster neighbor, but the difference is of varying importance. 
Both may be equally successful in answering comprehensive questions,
because these do not usually get at the extra reveries and
unessentials derived by the slow reader. 

Good reading is characterized by a plasticity and facility of change 
in mode of attack to meet the requirements of the particular situation. 
Anderson has pointed out that one trait of poor readers is that they
tend to read all materials in the same way. He states: “The fact that
poor readers could not effectively adopt any other than their ordinary
everyday reading attitude, . . . indicates that an essential difference
between good and poor reading is the better ability of good readers to
adapt their reading to different purposes.” Huey was aware of this
point when he stated, “Bad form in reading is doubtless as distressingly
common as bad form in swimming, skating, or tennis. . . .”
He goes on to deplore the prevalence of “dead level plodding, with
little thought of varying the speed according to the importance of
what is being read.”

Poor readers should be taught to vary the kind of reading to suit 
the purpose of the reading. They should be taught to select, to discriminate,
and to skim. “The art of reading is to skip judiciously,”
wrote Hamerton in his letters, “. . . to skip all that does not concern
us, whilst missing nothing that is really needed.”* The idea is
by no means new or unique. Its postulation by Francis Bacon has
the familiarity of an incontrovertible axiom: “Some books are to be
read only in parts; others to be read, but not curiously; and some few
to be read wholly, and with diligence and attention.”†

Readers of low intelligence will require more detailed reading to 
assimilate the same amount than will intelligent readers. The 
former are more dependent on the objective stimuli of the text. But 
most slow readers can improve their reading speed, and the improve- 
ment is undoubtedly due, in part at least, to a change in the habitual 
reading attitude. 

By what techniques, then, can we increase speed of reading with 
no serious loss to comprehension? Since, as we have said before, 
speed or rate is the temporal dimension of comprehension, the way to 
increased speed would seem to lie in lessening the demands upon 
comprehension. Attend, therefore, only to that which is essential and 
important. Several devices have been found useful in this connection, 
i.e., the various methods of pacing eye-movements. These methods





* Hamerton, P. G.: The Intellectual Life—The Power of Time.
† Bacon, Francis: Essays—Of Studies.
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work, not because they force eye-movements into a proper pattern 
(which they probably do not do), but because they force attention 
forward at a rate faster than is customary for the slow reader. The 
reader is thus forced to concentrate more rigorously, to get along with 
fewer cues from the text, and to fill in from the context. 

This same result can be achieved with motivated rapid reading 
without any pacing devices.** It is questionable, however, that any 
of these methods, unless long continued, will result in permanent 
improvement. Habits of slow, detailed reading are of long standing, 
and it is probably too much to expect that a few weeks of training will 
change them. Reading speed, like walking speed, may be habitually 
slow or fast. Both may be varied if attention is directed toward the 
processes involved. But both will usually lapse back to their cus- 
tomary condition when attention is directed elsewhere. The training 
program, then, regardless of the means used for treatment, must 
involve a periodic evaluation of progress, and must be of long enough 
duration to insure the establishment of a more efficient reading 
technique. Assuming the correction of any visual and vocabulary 
deficiencies, motivated rapid reading appears to be the most
salutary single medium for the attainment of reading proficiency.

AN EXPERIMENTAL STUDY OF CERTAIN FACTORS
INFLUENCING READING READINESS 


MARY CLARE PETTY 
The University of Texas 


When is a child ready to learn to read? Briefly stated, this is the 
problem of the present study. More specifically, it is a study of cer- 
tain factors influencing reading readiness. 

This problem is of great interest to teachers and is of no little impor- 
tance to their successful teaching in the field of beginning reading. 
Certainly the teachers themselves have realized the importance of the 
problem. The International Kindergarten Union” made a study of 
the subject, and more than ninety per cent of the five hundred sixty 
first-grade teachers who answered the questionnaire felt that they 
were expected to teach some children to read who were not ready. 
They estimated that twenty per cent of their pupils failed to show 
reading readiness when they entered school. 

The one hundred two subjects of this study were all pupils in the 
Austin public schools, and all but four were pupils of the investigator. 
Since this investigation was concerned with the problem of reading 
readiness, all the subjects were children in the low first grade. The 
Austin schools have neither kindergarten nor pre-primer classes, and 
the children were, therefore, entering school for the first time. 

In cases of children repeating the work of the low first grade, their 
reading achievement for only the first term they were in the grade was 
considered in any quantitative treatment of the material. 

All the children in a class were used as subjects regardless of the 
reading marks they were making with the exception of the first sixteen 
children tested. These sixteen formed a group for a preliminary study 
and included eight who had done “very good work” and eight who had
done “very poor” or “failing” work according to the teachers’ estimates.
The four children whom the investigator had not taught were
members of this preliminary group. The remaining eighty-six sub- 
jects, other than the preliminary group, were members of three differ- 
ent classes. 

Sixty boys and forty-two girls took the tests. The large number 
of boys was an accidental factor, for all children in a class were used 
as subjects. 

All the subjects were from English-speaking homes, and, therefore,
no language or racial differences had to be taken into consideration.

A record of each child’s work was kept. The reading marks were
used as the index of reading achievement. This was considered an
adequate measure of achievement since the relation of a child’s work 
to that of others in the group was of primary importance. Since one 
teacher did practically all the grading, the differences in teachers’ 
grading standards did not have to be taken into account. 

A case study of each child was also kept. Such records included 
any information which the teacher thought was of interest in interpret- 
ing the work the child was doing. Social, discipline, health, person- 
ality, and family problems were considered. 

The selection of factors to be considered in a study of reading 
readiness is a difficult task, for no one study can deal with all the 
factors involved. The present study is limited to a consideration of 
five factors: (1) Intelligence, (2) ability as revealed through an analyti- 
cal study of children’s drawings, (3) ability to deal with the symbols 
used in reading, (4) susceptibility to illusions, and (5) eidetic ability. 

Some explanation for the selection of these factors and a description
of the tests used are necessary.


TESTS USED 


The importance of intelligence to reading readiness is generally 
conceded. Any study of the subject seems somewhat incomplete 
without a consideration of this factor. 

The Herring Revision of the Binet-Simon Tests was selected as the 
test of intelligence. This test was selected because a test individual 
in character but not too time-consuming was desired. 

A drawing test was selected for two reasons: (1) General ability, 
rather than special talent, seems to determine the ability to draw up 
to the age of eight or ten years; (2) it was hoped that drawings might 
reveal differences in perceptual types. The results of studies of 
children’s drawings by Buhler,®** Goodenough, and Peck™ substantiate 
the first reason. The second reason may be made clear by giving a 
description of the differences in children’s perceptual processes as 
suggested by Meili.*! According to him, there are some children who 
perceive the whole without analyzing the component parts and who 
may even have actual difficulty in seeing the parts, and others who are 
of a more analytical type and for whom the details do have a great 
deal of significance. A study of the drawings should reveal such 
differences in perceptual types if they do exist. A well-balanced, 
highly-detailed drawing should indicate analysis, and a well-balanced
but not detailed drawing should indicate synthesis. 

Peck and Manuel’s Non-Language Prediction Test for Young 
Children® was the drawing test chosen. Both Forms A and B were 
given. This particular test was chosen for two reasons: its attractiveness
to children and its scoring, which considers only the number of
details included.

Certainly reading is a complex perceptual process which deals 
with the recognition and interpretation of symbols. The test used to 
determine ability to distinguish between symbols is an adaptation, or
condensed form, of the Lee-Clark Reading Readiness Test.° The test
was mimeographed from a stencil cut with a typewriter which had type
the size of that commonly used in primers. The test had two parts.
The first part was a criss-cross matching of letters. In a practice
exercise, five pairs of letters were matched. This exercise was neither 
scored nor timed. The test itself consisted of matching ten such pairs 
of letters; the first five were capital letters, and the last five, lower 
case ones. The second part was composed of pairs of words which 
were identical except for the inclusion of one superfluous letter in the 
second word of each pair. The words in a pair were made identical 
by marking out this superfluous letter. The practice exercise, neither 
scored nor timed, consisted of four such pairs of words, and the test 
itself consisted of twelve pairs. 

This test was given either individually or to no more than two 
or three at a time. The practice tests were fully explained until the 
child apparently understood what was expected of him. In both 
parts, the score was the number of correct responses. The length of 
time taken to complete each part was recorded. 

The importance of analysis and synthesis in the reading process 
has been emphasized. Gray has shown that both analysis and 
synthesis are of importance to even trained adults in reading, but that 
synthesis is the ultimate goal of all reading instruction. Huey has 
divided readers into the subjective who recognize words from the total 
character rather than from the dominating parts and the objective 
who recognize the dominating parts first and for whom the effect of 
the total is minor. Surely the highly analytical child would be at a 
special disadvantage with present methods of teaching reading to 
beginners. 

This study raises the question of whether a very analytical child 
would likely be less susceptible to illusion than a child who is accustomed
to a more synthetic approach to problems. If such is true, a
simple test of susceptibility to illusions should furnish a very practical 
method for the selection of a few children in each class who might 
better be taught reading by a more analytical method than that in 
general use today. 

Figure I shows the illusions used. The child was shown one figure 
at a time, and an attempt was made to ask the question in such a 
manner that the answer was not suggested. The specific questions 
asked were: 


Illusion 1.—“Do you see these two lines? Are they or are they not the
same length?”—After a negative answer, “Which is longer?”

Illusion 2.—The same as for Illusion 1.

Illusion 3.—“Are these two lines straight across or do they bend out?”

Illusion 4.—“Do you see these two middle circles? Are they or are they
not the same size?”—After a negative answer, “Which is larger?”

Illusion 5.—“Now tell me just what you see.” After any answer, “Can
you see anything else?” Time was given the child to consider this last
question.


There is a growing interest in eidetic imagery, but the relation of 
this form of imagery to school subjects has not been demonstrated. 
So far as the writer knows, there is no published study of the relation- 
ship of eidetic imagery to reading readiness. Allport® has suggested: 


The function of the E. I. seems to be to preserve and to elaborate a con- 
crete stimulus for the child in such a way as to intensify for him the sensory 
aspects of experience. By so doing, it enhances for him the meaning of the 
stimulus situation and enables him to repeat and to perfect his adaptive 
responses. 


If Allport is correct in this statement, eidetic ability should cer- 
tainly be of advantage to the child in learning to read. It seems only 
logical to expect that a child who has the ability to experience a very 
vivid, clear image of the word or sentence should be at an advantage 
in learning to read. 

All testing for eidetic imagery was done on clear days, and no 
artificial lighting was necessary. The light came from windows at the 
back of the child to be tested. The child was seated in front of 
the examiner at a small primary table. No head rest was used. The 
pictures were mounted on gray paper and were arranged in the order in 
which they were to be used. Sheets of gray paper were placed between 
the mounted pictures. The pictures were exposed and then slipped 
away leaving the gray sheet of paper as a projection screen. Both
the pictures and the projection screens were held about fifteen inches
from the subjects. Before any test of eidetic imagery was given, three
tests for after-images were given. This was done to familiarize the
child with the directions and to help him understand just what was
meant by “really seeing something that is not there,” thus keeping
him from confusing memory images and eidetic images. The durations
of the images in seconds were recorded; the coloring was also
recorded. The objects the child reported as having seen were checked
on the score sheet which contained the names of all the objects in the
pictures.

[FIGURE I.—ILLUSIONS USED IN TESTING]

The six tests, including those of after-images, and the length of 
exposure are listed: 


I. Tests for after-images. 


1. A two-inch red square to be fixated ten seconds. 

2. A two-inch red square to be fixated twenty seconds. 

3. A seven-inch black silhouette of a boy running to be fixated fifteen 
seconds. 


II. Tests for eidetic imagery. 


1. Picture in colors (five by six and one-half inches) of a child mailing a 
letter, street details in background, to be examined without fixation for 
twenty seconds.

2. School scene in colors (five by six and one-half inches), in foreground 
children arranging flowers in vase, in background child placing posters 
on wall, to be examined without fixation for twenty seconds. 

3. School scene in colors (five by six and one-half inches), in foreground 
children examining books, in background child placing writing on 
board and teacher assisting pupil, to be examined without fixation for 
twenty seconds. 


The technique for eidetic testing used in this study is similar to that 
of other investigators. Klüver,*® Jaensch,** Teasdale,*” and Peck and
Walling“ all made use of the preliminary tests for after-images to pre- 
pare the subjects for the eidetic tests. These same investigators all 
did their testing in daylight, avoiding the problem of artificial lighting, 
and they did not use head rests when the size of the eidetic images 
was not to be studied. The selection of the time for exposing the 
stimuli was difficult, for the various investigators use different periods 
of time for exposure. The lengths of exposure of stimuli employed 
are those used by Teasdale,*” who made preliminary tests and found 
these exposures apparently the most satisfactory. The same time 
was used by Peck and Walling.** In reviewing the materials generally 
used in eidetic testing, both Jaensch* and Klüver® state that dark
projection screens are generally used and consider pictures rich in
details and interest suitable for eidetic testing. The pictures used 
in this study are both interesting and rich in details. Peck and Wall- 
ing** gave two series of tests; the first series of tests employed the pic- 
tures used in the present study, and the second series employed pic- 
tures of a silhouette type. They found the same subjects reported 
eidetic images on both tests. Not a child reported an eidetic image 
on the one series and failed to do so on the other series. These results 
lead to the conclusion that the use of silhouette rather than a photo- 
graphic type of picture would have made no difference in the selection 
of eidetic children. 

The children were all tested in a familiar situation by an examiner 
whom they knew well. No child had any apparent difficulty in under- 
standing the directions given. 

No correlation between reading achievement and chronological 
age was indicated. In fact, a very small negative correlation of —.08 
was found. 

It seems conservative to state that chronological age is not of 
importance to success in beginning reading when the members of a 
group do not vary greatly in age and have all reached a minimum age 
of six years.

Mental age appears to be a very potent factor in the determination 
of reading readiness. A correlation of .52 ± .050 was found for
reading marks and mental age. The average MA for the entire 
group was eighty and two-tenths months, while the average MA for 
those children who made D in reading was seventy-three and six- 
tenths months. 

When the intelligence quotient, rather than mental age, was used 
as the index of intelligence, similar results were obtained. However, 
in this case, a slightly lower correlation was found. A correlation of
.48 ± .05 between reading grades and intelligence quotients was
found.
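
The figure which follows each coefficient is not defined in the text. If it is taken to be the classical probable error of a product-moment coefficient, 0.6745(1 - r²)/√N, which is an assumption made here and not a statement of the investigator's, the reported values are approximately reproduced for the one hundred two subjects. A minimal sketch in Python follows; the function name is hypothetical.

```python
import math


def probable_error_of_r(r: float, n: int) -> float:
    """Classical probable error of a correlation: 0.6745 * (1 - r^2) / sqrt(n).

    Assumption: the "±" figures in the text are probable errors of this kind.
    """
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


N_SUBJECTS = 102  # number of subjects reported in the study

for label, r in [("mental age", 0.52), ("intelligence quotient", 0.48)]:
    pe = probable_error_of_r(r, N_SUBJECTS)
    print(f"{label}: r = {r:.2f}, probable error = {pe:.3f}")
```

Under this assumption both coefficients carry a probable error of about .05, which agrees with the figures reported.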

A survey of the literature on the subject of reading readiness 
revealed several studies which seemed to indicate that a mental age of 
six years or six years, six months was necessary for standard first-grade 
work in reading. Twelve of the one hundred two subjects of the 
present experiment had mental ages of less than six years, and ten 
of these had to repeat the work of the low first grade. Of the thirty- 
four who had mental ages of less than six and one-half years, fourteen 
failed to pass to the high first grade after one term in school. 
Certainly intelligence is a very potent factor in the determination
of reading readiness, but it is equally true that the correlations found 
were low enough that the importance of other factors must be 
recognized. 

A correlation of .48 ± .05 was found between the total score on
Forms A and B of the Prediction Test and reading marks. The average
total score for all subjects was 68.4. The averages for different
groups, which are given in Table I, indicate the positive relationship
existing between reading achievement and success on the Prediction
Test.


TABLE I.—AVERAGE TOTAL SCORES ON THE PREDICTION TEST

                                        Mean     Mean     Mean
The group                         N     score,   score,   total
                                        Form A   Form B   score

All subjects                     102    34.8     33.6     68.4
Boys                              60    33.1     31.9     65.0
Girls                             42    37.2     36.1     73.3
Those who made A in reading       33    38.9     38.1     77.0
Those who made B in reading       27    35.7     36.7     72.4
Those who made C in reading       24    32.7     29.6     62.3
Those who made D in reading       18    28.7     26.3     55.0
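
The averages of Table I may be checked against one another, since the mean for all subjects should equal the mean of the subgroup averages weighted by the number in each subgroup. The sketch below, in Python, assumes that the rows with N of sixty and forty-two are the boys and the girls respectively, as the counts given in the text suggest; the function name is hypothetical.

```python
# (group label, N, mean total score) as read from Table I.
# Assumption: the rows with N = 60 and N = 42 are the boys and the girls.
by_sex = [("Boys", 60, 65.0), ("Girls", 42, 73.3)]
by_mark = [("A", 33, 77.0), ("B", 27, 72.4), ("C", 24, 62.3), ("D", 18, 55.0)]


def pooled_mean(groups):
    """Mean of the subgroup means, each weighted by its N."""
    total_n = sum(n for _, n, _ in groups)
    return sum(n * mean for _, n, mean in groups) / total_n


print(round(pooled_mean(by_sex), 1))   # compare with 68.4 for all subjects
print(round(pooled_mean(by_mark), 1))  # compare with 68.4 for all subjects
```

Both groupings return the reported average of 68.4 for all subjects, so the rows of the table are consistent with one another.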

















There appears to be another difference which cannot be expressed 
quantitatively between the drawings of those who were learning to 
read satisfactorily and the drawings of those who were failing to learn 
to read. The low scores of the ones doing satisfactory work appear 
to result from a high degree of synthesis. The low scores of the others 
appear to result from an inconsistency in the selection of details; 
some very minute details may be included while others which are more 
obvious are omitted. The first group seems to be highly synthetic 
when they score low on the Prediction Test, while the second group 
seems to be unsuccessfully analytic when low scores are made. 

The results of this study seem to indicate that a study of the 
drawings of children, such as that made possible by the use of the 
Prediction Test, is of value in determining reading readiness of begin- 
ners and that such a study is of greatest value when the drawings are 
considered both quantitatively and qualitatively. Further, it appears 
that such a study would probably be a sound basis for the selection 
of some children who should be taught by a more analytical method 
than that generally used. 
The ability to deal with symbols seems necessary to success in
beginning reading. The condensed form of the Lee-Clark Reading
Readiness Test furnishes two measures of ability: the time required to
complete the tests and the score. The two give correlations with
reading marks which are not very different, for the correlations are
.44 ± .05 for the total score and .40 ± .06 for the total time.

A study of the various averages indicates the consistent superiority 
of the better readers, except in the case of score on Part I where no 
difference is indicated. These averages may be found in Table II.


TABLE II.—AVERAGE SCORE AND TIME ON CONDENSED FORM OF THE LEE-CLARK TEST

                              Part I          Part II          Total
The group            N     Time   Score    Time   Score    Time   Score

All subjects        102    72.6    9.9    164.9   10.6    237.5   20.5
Boys                 60    70.1    9.9    172.3   10.3    242.3   20.2
Girls                42    76.2    9.9    154.2   11.0    230.3   20.9

Those who made:
  A in reading       33    58.9    9.9    127.4   11.3    186.3   21.2
  B in reading       27    70.9   10.0    168.3   11.3    239.2   21.3
  C in reading       24    68.6    9.9    181.6   10.7    250.2   20.6
  D in reading       18   105.8    9.9    205.9    8.2    311.7   18.1
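
Since the total time and total score of Table II are presumably the sums of the two parts, each row may be checked by simple addition. The following Python sketch makes that check; the row labels for the groups of sixty and forty-two are assumed to be boys and girls, and the variable names are hypothetical.

```python
# (group, Part I time, Part I score, Part II time, Part II score,
#  total time, total score) as read from Table II.
rows = [
    ("All subjects", 72.6, 9.9, 164.9, 10.6, 237.5, 20.5),
    ("Boys (assumed label)", 70.1, 9.9, 172.3, 10.3, 242.3, 20.2),
    ("Girls (assumed label)", 76.2, 9.9, 154.2, 11.0, 230.3, 20.9),
    ("A in reading", 58.9, 9.9, 127.4, 11.3, 186.3, 21.2),
    ("B in reading", 70.9, 10.0, 168.3, 11.3, 239.2, 21.3),
    ("C in reading", 68.6, 9.9, 181.6, 10.7, 250.2, 20.6),
    ("D in reading", 105.8, 9.9, 205.9, 8.2, 311.7, 18.1),
]

# Largest gap between (Part I + Part II) and the reported total.
time_gap = max(abs(t1 + t2 - tt) for _, t1, _, t2, _, tt, _ in rows)
score_gap = max(abs(s1 + s2 - st) for _, _, s1, _, s2, _, st in rows)

print(f"largest time discrepancy:  {time_gap:.2f}")
print(f"largest score discrepancy: {score_gap:.2f}")
```

The parts agree with the totals to within one-tenth of a unit, a difference which rounding of the averages would account for.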


























Apparently a child must be able to distinguish between visual 
sensations of the kind experienced in reading even when the sensations 
are as similar as those experienced if the stimuli are “it” and “ite”
or “can” and “cdan.” Otherwise, he is unable to deal with printed
symbols and is not ready to learn to read. 

There is also evidence that too simple a test of this ability is not 
advisable. Lee, Clark, and Lee'! reported correlations ranging from 
nine to twenty-eight points higher than those obtained with the 
adaptation of the Lee-Clark Reading Readiness Test used in the 
present study. The test used in this study was primarily a condensed 
form of the Lee-Clark Test, and it is possible that the difference in 
correlations is due to the difference in length of the two tests. 

It is impossible to draw any positive and definite conclusions as to 
the relationship existing between reading readiness and susceptibility 
to illusions. Certainly more experimentation in this field would be 
necessary before making any such conclusions. However, it seems 
conservative to say that there is some evidence that normal or exaggerated
susceptibility to illusions is the state most favorable to success
in beginning reading, and that this may be true because the lack of 
normal susceptibility is possibly due to a highly analytical approach 
to the problem, and because the usual method of teaching reading is 
highly synthetic. The following evidence may be cited for the above 
statement: 

1. Those children who had to repeat the work of the low first grade 
experienced, on the average, only three and seven-tenths out of a 
possible five illusions; the average for all other groups and for all 
subjects was four. Thus it is seen that these “repeaters” were the
only children who were differentiated from all others with regard to
the susceptibility to illusions. This difference is significant since the
sigma of the difference for the average for the ones who repeated the 
low first-grade work and the average for all other subjects was found 
to be .14; that is, the difference was more than twice the sigma of the 
difference. 

2. Five, or twenty-seven and seven-tenths per cent, of the eighteen
children who experienced only two or three illusions made D in reading,
and one, or four and seven-tenths per cent, of the twenty-one who were
susceptible to all five illusions made D. The percentage of D’s for 
all subjects was seventeen and six-tenths. 

3. There is some indication that a high score on the Prediction 
Test is coupled with decreased susceptibility to illusions. The ten 
children who made a score of ninety or more on the Prediction Test 
reported an average of three and seven-tenths illusory experiences. 


The children who scored below fifty on the Prediction Test did not
differ from the whole group in the average number of illusions reported.
However, thirty-three and three-tenths per cent of these reported 
five illusions, while not a child in the first group and only twenty and 
six-tenths per cent of all subjects reported as many as five illusions. 
It is possible to explain these facts on the basis of differences in per- 
ceptual types, the analytic and the synthetic. 
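
The arithmetic behind the evidence cited above may be illustrated briefly. The Python sketch below computes the ratio of a difference between averages to the sigma of that difference, and the percentages of D's in the groups named; the function names are hypothetical, and the figures are those given in the text.

```python
def critical_ratio(mean_a: float, mean_b: float, sigma_diff: float) -> float:
    """Difference between two averages expressed in units of its sigma."""
    return abs(mean_a - mean_b) / sigma_diff


def percent(part: int, whole: int) -> float:
    """Part of a whole expressed as a percentage."""
    return 100.0 * part / whole


# Item 1: repeaters averaged 3.7 illusions, all others 4; sigma of difference .14.
print(round(critical_ratio(4.0, 3.7, 0.14), 2))

# Item 2: D's among those seeing two or three illusions, among those
# seeing all five, and among all one hundred two subjects.
print(round(percent(5, 18), 1))
print(round(percent(1, 21), 1))
print(round(percent(18, 102), 1))
```

The ratio comes to a little more than two, as the text states. By this arithmetic the first two percentages round to 27.8 and 4.8, so that the figures reported appear to have been truncated rather than rounded.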


EIDETIC ABILITY AS A FACTOR 


Eidetic ability was found to be a factor in determining achievement 
in beginning reading. The percentage of eidetic children among
the good readers was found to be much higher than that among the very 
poor readers. Forty-one and one-tenth per cent of all subjects 
reported eidetic images. Although only twenty-two and two-tenths 
per cent of those who had to repeat the low first grade were eidetic,
fifty-one and five-tenths per cent of the ones who made A in reading 
reported eidetic images. The images of the better readers seem to be 
more detailed than those of the other children. Table III gives the 
average number of details for the different groups. The findings 
with regard to the duration of eidetic images are confusing. There is 
little difference in the duration of images for the various groups 
except for those who made C in reading; their average was only ten 
and four-tenths seconds as against nineteen and two-tenths seconds 
for all subjects. An examination of Table III shows that there is some
evidence that the eidetic children who are also superior readers report 
eidetic images on all three tests more often than do the eidetic children 
who are poorer readers. 


TABLE III.—SUMMARY OF DATA CONCERNING EIDETIC IMAGES

                                                       Groups
                                           All      “A”      “B”      “C”      “D”
                                         subjects  readers  readers  readers  readers

Number of subjects                         102       33       27       24       18
Percentage reporting eidetic images        41.4     51.5     44.4     37.5     22.2
Average duration in seconds                19.2     21.0     22.8     10.4     18.9
Average number of details                   3.8      4.4      3.5      3.1      3.1
Percentage of eidetic subjects reporting:
  Three eidetic images                     69.0     76.5     66.7     55.5     75.0
  Two eidetic images                       16.6     11.7     16.6     22.2     25.0
  One eidetic image                        14.3     11.7     16.6     22.2      0.0











Although eidetic ability appears to be a factor in the determination 
of reading readiness, the extent of the importance of this factor is 
questionable. Correlations between reading marks and the presence 
or absence of eidetic ability may be determined in varying ways. 
Table IV gives several such correlations. The median of all the 
correlations found is .26. When numerical values ranging from one 
for D to ten for A+ are assigned to the reading grades, the average 
for the eidetic subjects is 5.796, and the average for the non-eidetic 
subjects is 4.950. This difference is not entirely reliable, since the 
sigma of the difference for these groups is .57. 
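
The two figures of this paragraph may be verified from the data reported. The Python sketch below takes the seven coefficients of Table IV (six methods, the bi-serial method contributing two values, which is the reading assumed here) and finds their median; it then expresses the difference between the two grade averages in units of its sigma. The variable names are hypothetical.

```python
import statistics

# Coefficients as read from Table IV; assumed to be "all the correlations found".
coefficients = [0.70, 0.48, 0.26, 0.40, 0.25, 0.22, 0.26]
median_r = statistics.median(coefficients)

# Average numerical reading grade for eidetic and non-eidetic subjects,
# and the reported sigma of the difference.
eidetic_mean, non_eidetic_mean, sigma_diff = 5.796, 4.950, 0.57
ratio = (eidetic_mean - non_eidetic_mean) / sigma_diff

print(f"median coefficient: {median_r:.2f}")
print(f"difference / sigma of difference: {ratio:.2f}")
```

The median agrees with the .26 reported, and the difference is only about one and one-half times its sigma, which is in keeping with the statement that it is not entirely reliable.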

The positive correlations given in Table IV cannot be explained 
as correlations of eidetic ability with intelligence rather than with 
reading achievement, for correlations between intelligence quotients
and the presence of eidetic ability for the subjects of the present study 
are very low. When the six methods listed in Table IV are employed, 
the average of the correlations between intelligence quotients and the 
presence of eidetic ability is only .17. 


TABLE IV.—THE CORRELATION OF READING MARKS AND EIDETIC ABILITY


METHODS USED FOR DETERMINING THE CORRELATIONS                    CORRELATION
1. The product-moment coefficient of correlation for a fourfold
   [remainder illegible]                                              .70
2. [illegible]                                                        .48
3. [illegible]                                                        .26
4. [illegible]                                                        .40
5. Coefficient of contingency (corrected for possible maximum)        .25
6. Bi-serial coefficient of correlation.
   [illegible]                                                        .22
   [illegible]                                                        .26


* These methods for the determination of correlation are fully described and 
the formulae for obtaining them are given in Odell, C. W.: Statistical Method in 
Education. D. Appleton-Century Co., New York, 1935, pp. 309-325. 


Certainly the correlations found are high enough to show a positive 
relationship between eidetic ability and reading achievement in the 
low first grade and to make further investigation of the problem 
highly desirable. 

The tests for after-images were given primarily as aids to eidetic 
testing. However, the results indicate that the better readers report 
after-images more often and are more likely to report after-images on 
all three tests than are the poorer readers. The nature of the images 
and their duration in seconds show no differences that are significant 
for the various groups. 

The results of these tests may be summarized by stating that rich 
imagery appears to be a very real factor in determining success in 
beginning reading. 


SUMMARY AND CONCLUSIONS 


The present study is an experimental one of certain factors influ- 
encing reading readiness. 

The study used one hundred two subjects, all of whom were in the
low first grade in the Austin public schools. Each subject was given
five tests: (1) The Herring Revision of the Binet-Simon Tests, Form
A, (2) Peck and Manuel’s Non-Language Prediction Test for Young
Children, Forms A and B, a drawing test, (3) a condensed form of the
Lee-Clark Reading Readiness Test, (4) a test for susceptibility to
illusions, and (5) a test of eidetic imagery. Records of reading marks
and personal case studies were kept for all the subjects. 

The present investigation leads to the following conclusions: 

(1) Chronological age is not of importance to success in beginning 
reading when the members of a group do not vary greatly in age, and 
when all the members of the group are at least six years old. 

(2) A correlation of .52 ± .05 was found for reading marks and
mental age. A correlation of .48 ± .05 between reading marks and
intelligence quotients was found. It may be said that intelligence is a 
very important factor in the determination of reading readiness, but it 
is equally true that the correlations found were low enough that the 
importance of other factors must be recognized. 

(3) A correlation of .48 ± .05 was found for the total score on
Forms A and B of the Prediction Test and reading marks. The results 
of this study indicate that a study of the drawings of children, such 
as that made by the use of the Prediction Test, is of value in determin- 
ing the reading readiness of beginners and that such a study is of 
greatest value when the drawings are considered both quantitatively 
and qualitatively. 

(4) The ability to distinguish between visual symbols is necessary 
to success in beginning reading. The condensation of the Lee-Clark 
Reading Readiness Test furnished two measures of such ability: time
required to complete the tests and accuracy as indicated by the score.
The two gave correlations with reading grades which are not very
different, for the correlations are .44 ± .05 for the total score and
.40 ± .06 for the total time.

(5) No positive conclusions may be stated as to the relationship 
existing between reading readiness and susceptibility to illusions. 
However, there is some evidence that normal or exaggerated sus- 
ceptibility to illusions is the state most favorable to success in begin- 
ning reading. This may possibly be true because the lack of normal 
susceptibility is probably coupled with a highly analytical approach 
to the problem, and because the usual methods of instruction are 
synthetic. 

(6) Eidetic ability appears to be a factor in the determination 
of the success of a child in beginning reading. The median of a num- 
ber of coefficients of correlation between reading marks and the 
presence of eidetic ability was .26.

(7) While the positive relation of all the factors considered to 
reading readiness is indicated, it is true that none of the correlations 
determined are high enough to be accurate in every case of individual
prediction. An examination of some of the individual case studies 
shows that the explanation of many children’s work lies in home, social, 
health, disciplinary, or personality problems. 

THE STABILITY IN PATTERN OF FACTOR LOADINGS:
A COMMENT ON DR. SMART’S CONCLUSIONS 


LLOYD G. HUMPHREYS* 


Yale University 


THE SMART DATA 


Smart describes the purpose of a recent study as follows: “This 
study was undertaken to discover whether or not the factors remained 
stable when the number of variables was small, that is, whether or not 
factor loadings maintained their size and relative standing as more 
tests were added to the correlational matrix.” 

Smart obtained scores from sixty-six five-and-one-half-year-old 
children on the following eleven variables (the numbers will be used 
hereafter instead of test names): 


1. Minnesota verbal. 
2. Minnesota non-verbal. 
3. Arthur point scale. 
4. Chronological age. 
5. Stanford-Binet. 


The McCarthy Language Survey (free and controlled situations): 


6. Average number of words in fifty responses, c. 
7. Average number of words in fifty responses, f. 

8. Percentage of pronouns in total number of words, c. 
9. Percentage of pronouns in total number of words, f. 
10. Average number of words in five longest responses, c. 
11. Average number of words in five longest responses, f. 


A typical procedure consisted of analyzing the first five variables 
alone and then making successive analyses as the McCarthy tests 
were added one at a time. As a result of this and other similar pro- 
cedures, carried to two factors by the 1933 Thurstone* method, Smart 
concluded that stable factors could not be found and that the change 
was more apparent in the second factor than in the first. 
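
The extraction step underlying these analyses can be sketched roughly as follows. This is a generic textbook form of the first centroid factor, not necessarily the exact 1933 Thurstone procedure that Smart used. The correlation matrix `R` is hypothetical, and the function name `first_centroid_factor` is introduced here only for illustration. The communality estimate (the highest coefficient in each column) follows the convention mentioned in the footnote to Figure 1.

```python
from math import sqrt


def first_centroid_factor(r):
    """First centroid loadings and residual matrix for a symmetric
    correlation matrix r given as a list of lists."""
    n = len(r)
    # Work on a copy so the caller's matrix is left untouched.
    m = [row[:] for row in r]
    # Communality estimate: highest absolute off-diagonal coefficient
    # in each column is placed in the diagonal cell.
    for j in range(n):
        m[j][j] = max(abs(r[i][j]) for i in range(n) if i != j)
    total = sum(sum(row) for row in m)
    # Each loading is the row sum divided by the square root of the
    # grand total of the matrix.
    loadings = [sum(row) / sqrt(total) for row in m]
    # Residual correlations after removing the first factor.
    residuals = [
        [m[j][k] - loadings[j] * loadings[k] for k in range(n)]
        for j in range(n)
    ]
    return loadings, residuals


# Hypothetical intercorrelations of three tests (not Smart's data).
R = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]
loadings, residuals = first_centroid_factor(R)
residual_total = sum(sum(row) for row in residuals)
```

By construction the residuals sum to zero, which is why some variables must be reflected before a second centroid factor can be taken out, and why the sign of that second factor is arbitrary.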

Much of the basis for the second conclusion can be discounted by 
inspection of the results. Table I contains successive second factor 
loadings from two of Smart’s procedures, as estimated from his 
graphs. It can be seen that he has disregarded the arbitrariness of 
second factor signs. For example, by multiplying the II′ columns 
through by -1.00, they are brought more into line with the preceding 
loadings. 

* The writer is indebted to George M. Kuznets for assistance in the preparation 
of this paper. 








TABLE I 

Variable     II      II′        Variable     II          II′
    1       -.21     .00            1       -.20         -.08
    2        .14    -.42            2        .12         -.40
    3        .02    -.44            3       [illegible]  -.42
    4        .72    -.36            4        .66         -.32
    5        .40    -.30            5        .54         -.18
    6                .45            6       -.26          .48
    7       -.26     .04            7       -.16          .02
    8                               8       -.10          .30
    9                               9        .10          .08
   10       -.35     .66           10       -.40          .64
   11       -.58     .30           11       -.38          .28
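
The arbitrariness of factor signs can be illustrated with a short sketch. It assumes orthogonal factors, so that a reproduced correlation is the sum of products of loadings. The helper names are hypothetical, and the three rows in `pattern` are illustrative two-factor loadings. The lists `ii` and `ii_next` hold the first pair of columns in Table I for the variables that have an entry in both (1 to 5, 7, 10, 11).

```python
def reproduced_r(pattern, j, k):
    """Correlation between variables j and k implied by the loadings."""
    return sum(a * b for a, b in zip(pattern[j], pattern[k]))


def reflect(pattern, factor):
    """Multiply one factor column through by -1."""
    return [
        [-a if m == factor else a for m, a in enumerate(row)]
        for row in pattern
    ]


def cross_product(x, y):
    """Sum of products of two sets of loadings on the same variables."""
    return sum(a * b for a, b in zip(x, y))


# Illustrative loadings on factors I and II for three variables.
pattern = [[0.50, 0.03], [0.75, -0.50], [0.65, 0.63]]
flipped = reflect(pattern, factor=1)

# Reflecting the second factor changes no reproduced correlation.
pairs = [(0, 1), (0, 2), (1, 2)]
same = all(
    abs(reproduced_r(pattern, j, k) - reproduced_r(flipped, j, k)) < 1e-12
    for j, k in pairs
)

# Successive second-factor loadings from the first procedure in Table I.
ii = [-0.21, 0.14, 0.02, 0.72, 0.40, -0.26, -0.35, -0.58]
ii_next = [0.00, -0.42, -0.44, -0.36, -0.30, 0.04, 0.66, 0.30]

before = cross_product(ii, ii_next)
# A negative cross product suggests the later column is a reflection.
aligned = [-a for a in ii_next] if before < 0 else ii_next
after = cross_product(ii, aligned)
```

The negative cross product before reflection, and the positive one after, is one simple way of deciding when a column should be multiplied through by -1 before two analyses are compared.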




















After correcting second factor signs there still remains some shift in 
factor pattern. In order to check this the present author 
re-computed Smart’s analyses in the procedure described, using the 
published table of intercorrelations. The most conspicuous change is 
observed between the analyses involving five and eleven variables. 
Table II contains these centroid loadings, including three factors for 
the eleven variables. The second factor residuals are large enough, 
by the usual criteria, to warrant a third factor. The standard deviation of the 
distribution of these residuals is .126, and the larger residuals are 
mainly found in connection with variables 4 and 5. 

TABLE II 

             Five-variable                    Eleven-variable
Variable     I            II      h²          I      II     III     h²
    1       .50           .03    .25         .67    -.09   -.23    .51
    2       .75          -.50    .81         .60    -.42   -.34    .66
    3       [illegible]  -.52    .86         .68    -.36   -.37    .73
    4       .50           .37    .39         .29    -.41    .53    .53
    5       .65           .63    .82         .60    -.19    .54    .69
    6                                        .39     .49    .03    .39
    7                                        .67     .17   -.32    .57
    8                                       -.12     .28    .15    .12
    9                                        .13     .02    .27    .09
   10                                        .56     .55    .14    .63
   11                                        .85     .42   -.16    .93
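
As a rough consistency check on the eleven-variable loadings, the sketch below assumes that each communality is the sum of the squared loadings on the three orthogonal centroid factors. The dictionary names are introduced here for illustration only; the numbers are the eleven-variable entries of Table II.

```python
# Eleven-variable loadings (I, II, III) from Table II, keyed by variable.
loadings = {
    1: (0.67, -0.09, -0.23),
    2: (0.60, -0.42, -0.34),
    3: (0.68, -0.36, -0.37),
    4: (0.29, -0.41, 0.53),
    5: (0.60, -0.19, 0.54),
    6: (0.39, 0.49, 0.03),
    7: (0.67, 0.17, -0.32),
    8: (-0.12, 0.28, 0.15),
    9: (0.13, 0.02, 0.27),
    10: (0.56, 0.55, 0.14),
    11: (0.85, 0.42, -0.16),
}

# Communalities as printed in the last column of Table II.
tabled_h2 = {
    1: 0.51, 2: 0.66, 3: 0.73, 4: 0.53, 5: 0.69, 6: 0.39,
    7: 0.57, 8: 0.12, 9: 0.09, 10: 0.63, 11: 0.93,
}


def communality(row):
    """Sum of squared loadings across the factors."""
    return sum(a * a for a in row)


computed_h2 = {v: communality(row) for v, row in loadings.items()}
gaps = {v: abs(computed_h2[v] - tabled_h2[v]) for v in loadings}
max_gap = max(gaps.values())
```

Under that assumption every computed value falls within rounding distance of the printed communality, which also gives a quick way to spot a misprinted loading.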

Figure 1* illustrates the shift in factor pattern found by Smart. 
The encircled numbers are based upon the analysis of the first five 
variables alone; the plain numbers show the same variables when they 
are included in the entire battery of eleven, the third factor being 
disregarded. 

[Figures 1 and 2: in Fig. 1 factor I is the horizontal axis and factor II the vertical axis; in Fig. 2 factor II is the horizontal axis and factor III the vertical axis.] 

Fig. 1. Encircled numbers from five-variable analysis, plain numbers 
from eleven-variable analysis. 

Fig. 2. Variables from eleven-variable analysis, the second factor being 
plotted against the third. 

When the third factor is plotted against the second, however, we 
obtain the results shown in Fig. 2. It can be seen that variables 4 
and 5 are again drawn apart from the rest, the same as they were when 
only the five were analyzed. Factor patterns are thus shown to remain 
surprisingly stable in even small numbers of variables. 

* In contrast to the tables, where the raw loadings have been presented, augmented 
factor loadings have been used in this and the following figure. Also, only 
one estimate of the communalities, the highest coefficient in a column, has been 
used in these and the Kelley-McNemar analyses. 


KELLEY-MCNEMAR DATA 


In order to obtain a further check on the supposed dependence of 
factor patterns upon the variables analyzed, the data of Kelley and 
McNemar, consisting of intercorrelations of twenty-one variables, 
were analyzed to three factors. Then seven variables were taken out 
at random and a further analysis undertaken. These results are 
presented in Table III. 


TABLE III 

            Twenty-one variables                 Fourteen variables           Fifteen variables
Variable    I       II      III     h²           I      II     III    h²      I        II     h²
    0       illeg.  illeg.  illeg.  illeg.       ...    ...    ...    ...     illeg.  -.23   .67
    1       .48    -.25    -.48     .52          .46   -.29   -.44    .48     ...      ...    ...
    2       .58    -.28    -.06     .42          ...    ...    ...    ...     illeg.  -.30   .39
    3       .73    -.23     .23     .64          .75   -.22    .23    .66     .75     -.26   .62
    4       .71    -.28     .31     .67          .71   -.19    .20    .58     .74     -.29   .64
    5       .65    -.23     .21     .51          .65   -.29    .28    .58     .67     -.28   .53
    6       .60    -.08    -.28     .45          .56   -.12   -.19    .36     ...      ...    ...
    7       .20     .41     .06     .21          .20    .42   -.13    .23     .23      .40   .21
    8       .49     illeg.  illeg.  illeg.       ...    ...    ...    ...     illeg.   .37   .38
    9       .42     .30     .23     .32          .44    .27    .24    .32     .48      .32   .33
   10       .52     .46     .18     .52          .55    .46    .16    .54     .57      .46   .53
   11       .33     .27    -.30     .27          ...    ...    ...    ...     ...      ...    ...
   12       .28    -.20    -.26     .19          .26   -.32   -.29    .26     ...      ...    ...
   13       illeg.  illeg.  illeg.  illeg.       ...    ...    ...    ...     illeg.  -.38   .44
   14       .55    -.23     .12     .37          .54   -.21    .05    .34     .56     -.29   .39
   15       .62    -.30     .24     .53          .62   -.34    .19    .54     .63     -.36   .52
   16       .26    -.23    -.17     .15          ...    ...    ...    ...     ...      ...    ...
   17       illeg.  .32     .10     .12          .12    .34    .17    .16     .14      .24   .08
   18       .33     .42     .00     .28          .34    .38   -.18    .30     .32      .38   .25
   19       .34     .08    -.27     .19          .32    .08   -.27    .18     ...      ...    ...
   20       .38     illeg.  illeg.  illeg.       ...    ...    ...    ...     illeg.   .21   .20


























As a further check, one of the three clusters composed of six vari- 
ables was removed and another analysis computed. It was now 
possible to obtain only two factors where three had been obtained 
prior to the removal of the cluster. The standard deviations of the 
distributions of the first and second factor residuals were reduced 
from .101 to .068 in the twenty-one-variable analysis and from .112 
to .051 after removal of the cluster. These centroid loadings are also 
contained in Table III. 

It is not necessary to plot any of the above, as mere inspection of 
the table shows the lack of variability in factor pattern. There is 
little change in even the absolute size of the loadings. 
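
The inspection can also be put in rough numerical form. The sketch below is an illustration added here, not a computation reported in the paper. It takes the first-factor loadings of Table III for the thirteen variables that are legible in both the twenty-one-variable and the fourteen-variable analyses and summarizes the absolute differences; the variable names are hypothetical.

```python
# First-factor loadings from Table III, keyed by variable number.
full_battery = {
    1: 0.48, 3: 0.73, 4: 0.71, 5: 0.65, 6: 0.60, 7: 0.20, 9: 0.42,
    10: 0.52, 12: 0.28, 14: 0.55, 15: 0.62, 18: 0.33, 19: 0.34,
}
reduced_battery = {
    1: 0.46, 3: 0.75, 4: 0.71, 5: 0.65, 6: 0.56, 7: 0.20, 9: 0.44,
    10: 0.55, 12: 0.26, 14: 0.54, 15: 0.62, 18: 0.34, 19: 0.32,
}

# Absolute change in loading for each variable common to both analyses.
gaps = {
    v: abs(full_battery[v] - reduced_battery[v])
    for v in full_battery
    if v in reduced_battery
}
mean_gap = sum(gaps.values()) / len(gaps)
max_gap = max(gaps.values())
```

Both summaries are small relative to the loadings themselves, in keeping with the conclusion drawn from inspection of the table.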


RUNDQUIST-SLETTO DATA 


Smart also made analyses of the data of Rundquist and Sletto. 
These data consisted of the intercorrelations of six attitude scales in 
nine groups of subjects. The resulting nine analyses were again 
carried to two factors. They led Smart to state that the only con- 
sistency observed was that the first factor loadings for the first scale 
were consistently largest. He concludes: “There is also a striking 
change in factors from one group of subjects to another with the scales 
kept the same for all groups.” 






































TABLE IV 

Group                 I            II     III    IV     V      VI     VII    VIII   IX
N                     500          100    100    100    50     500    100    100    100

Scales
  1                   .89          .97    .98    .77    .84    .81    .77    .82    .89
  2                   .52          .55    .41    .49    .70    .51    .27    .55    .61
  3                   .51          .45    .45    .51    .47    .52    .45    .68    .57
  4                   .64          .55    .66    .66    .80    .67    .54    .69    .67
  5                   .26          .82    .37    .19    .48    .85    .36    .35    .43
  6                   [illegible]  .51    .51    .56    .57    .51    .16    .70    .57

SD (residuals)        .045         .048   .074   .078   .114   .070   .105   .090   .078
Highest residual      .07          .09    .16    .13    .22    .14    .24    .14    .13
SE of zero-order r    .045         .100   .100   .100   .141   .045   .100   .100   .100





The author and others* re-computed the above analyses to one 
factor, making several approximations until stable communalities 
were obtained. These loadings are contained in Table IV, along with 
the standard deviation of the second factor residuals, the highest 
residual, and the standard error of a zero-order r for the appropriate 
groups. 





* The writer is indebted to four members of Mr. Kuznets’s class in factor analysis 
for the first four of the present analyses. 

A comparison of the last three rows in the table shows that Smart’s 
second factor can probably be disregarded for all of the groups except 
VI and VII. Only in these two groups are the residuals large enough 
to warrant a second factor, it being omitted here in order to facilitate 
the comparison. 
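
The comparison can be sketched as follows. The sketch assumes that the standard error of a zero-order r is taken as 1/√N, which reproduces the bottom row of Table IV, and that a further factor is called for only when the standard deviation of the residuals exceeds that value. The paper also weighs the highest residual, which this simplified rule ignores. Variable names are illustrative.

```python
from math import sqrt

groups = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"]
n_cases = [500, 100, 100, 100, 50, 500, 100, 100, 100]

# Standard deviations of the second factor residuals, from Table IV.
sd_residuals = [0.045, 0.048, 0.074, 0.078, 0.114, 0.070, 0.105, 0.090, 0.078]

# Standard error of a zero-order r, assumed here to be 1 / sqrt(N),
# rounded to three places as in the table.
se_zero_r = [round(1 / sqrt(n), 3) for n in n_cases]

# Groups whose residuals are larger than chance alone would suggest.
needs_second_factor = [
    g for g, sd, se in zip(groups, sd_residuals, se_zero_r) if sd > se
]
```

Under these assumptions only groups VI and VII are flagged, which agrees with the reading of the table given above.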

Inspection of the first factor loadings indicates that groups V and 
VII are most atypical. As the loadings in group V would be expected 
to fluctuate more than in the others because of the smaller number of 
cases (fifty), Smart’s best case for change in factor pattern in different 
groups remains group VII. 

The most nearly comparable loadings, on the other hand, are those 
of groups I and VI. One can only conjecture whether this is due to 
the larger number of cases (five hundred) and the resulting stability 
of the intercorrelations or to the comparability, except for sex, of the 
two samples. The question of sampling errors or group differences is 
also raised in conjunction with the larger, but still relatively small, 
differences in pattern exhibited by the remaining groups. It is 
probably unwise to conclude, with the possible exception of group VII, 
that the variations observed are group differences when we have so 
little knowledge of the effect of sampling on factor loadings. 


SUMMARY 


Smart, by changing the variables in small correlational matrices in 
successive analyses, had concluded that stable factors could not be 
found and that more change was apparent in the second factor than 
in the first. These conclusions have been investigated in two studies. 

From an analysis of Smart’s own data it was discovered that some 
of the lack of stability could be explained on the basis of the arbi- 
trariness of second factor signs. Most of the remaining instability 
was found to be due to his omission of a possible third factor in the 
battery. This third factor explained the apparent erratic behavior of 
two of the variables between analyses based upon five and eleven 
variables, respectively. 

In the second part of the study a matrix of twenty-one variables 
taken from Kelley and McNemar was analyzed. Then, successively, 
two further analyses were made for comparison, one in which seven 
variables were removed at random, the other in which six variables 
constituting one cluster were removed. Little change in pattern was 
apparent in either. In addition, one factor was lost in the second 
situation. 

From an analysis of data from Rundquist and Sletto, Smart concluded 
that there was also a striking change in factor pattern from 
one group of subjects to another, the scales being kept the same for 
all groups. When this conclusion was investigated, it was found that 
Smart’s second factor could be disregarded in all but two of the groups, 
and only one group gave sufficient evidence of first factor variability 
to support the conclusion. It was suggested that the remaining vari- 
ability might be due to sampling errors rather than group differences. 